Implementing printf and scanf in C

printf and scanf are the most frequently used input and output functions in C. printf writes to stdout and scanf reads from stdin; both of these are file streams, so the two functions are really file operations. Despite their common use, both work in ways that are worth understanding: knowing the internals gives a new perspective and helps coders write more advanced and more secure code.

The file versions of these functions are:

  • fprintf
  • fscanf

The string versions are:

  • sscanf
  • sprintf

The file and string versions of printf and scanf operate on a different source or destination (a file or a string instead of the console), but they all work on the same principle.
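One way to picture this shared principle is to write each variant as a thin wrapper around a single va_list-based routine. The sketch below uses the standard vfprintf and vsnprintf functions for the actual formatting; the wrapper names my_printf and my_snprintf are hypothetical and only serve as illustration. The va_list mechanics it relies on are covered in the next section.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: console output is just file output to stdout. */
int my_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vfprintf(stdout, fmt, ap);   /* same formatter, stdout as the file */
    va_end(ap);
    return n;
}

/* Hypothetical wrapper: same formatting, but the destination is a string. */
int my_snprintf(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, cap, fmt, ap); /* never writes more than cap bytes */
    va_end(ap);
    return n;
}
```

Calling my_snprintf(buf, sizeof buf, "%d-%s", 42, "ok") stores the same text that my_printf("%d-%s", 42, "ok") would print; only the destination differs.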

Key concepts involved are:

  • Variable number of arguments in functions, using the varargs facility (<stdarg.h>) in C.
  • Use of an internal buffer to prepare the input or output.

Vararg: Variable argument functions


printf and scanf are two functions you will encounter that accept any number of arguments. At first this may seem like an advanced trick, but you can write your own functions that accept a variable number of arguments as well. The mechanism is varargs, provided by <stdarg.h>.

We demonstrate this with an average() function that accepts any number of arguments:


#include <stdarg.h>
#include <stdio.h>

double average(int count, ...)
{
    va_list ap;
    int j;
    double sum = 0;

    va_start(ap, count); /* Requires the last fixed parameter (to get the address) */
    for (j = 0; j < count; j++) {
        sum += va_arg(ap, int); /* Increments ap to the next argument. */
    }
    va_end(ap);

    return sum / count;
}

int main(int argc, char const *argv[])
{
	printf("%f\n", average(3, 1, 2, 3) );
	return 0;
}

In the above code, the function average() can take any number of int arguments; the first parameter, count, tells it how many follow.

Key points to note:

  • Arguments are traditionally passed on the stack. On many modern platforms (x86-64, for example) the first few arguments travel in registers instead, and va_list hides this detail.
  • va_start initializes the va_list so that it refers to the first unnamed argument. It must be passed the last named parameter in the function declaration or it will not work.
  • va_arg extracts the next argument as the type provided and then modifies ap so it points to the following argument. The type must match what the caller passed after the default argument promotions (char and short arrive as int, float arrives as double).
  • va_end cleans up the va_list and must be called before the function returns.
  • va_start, va_arg and va_end are implemented as preprocessor macros. The actual implementation depends on the compiler and platform, as these differ in how arguments are laid out and passed.
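The point about argument types can be made concrete with a function that, like printf, is told the type of each argument by a string. This is a minimal sketch: sum_mixed and its type codes ('i' for int, 'd' for double) are hypothetical names chosen for illustration, not part of any library.

```c
#include <stdarg.h>

/* Hypothetical helper: sums the arguments described by a type string,
   'i' for int and 'd' for double, e.g. sum_mixed("id", 1, 2.5). */
double sum_mixed(const char *types, ...)
{
    va_list ap;
    double sum = 0.0;

    va_start(ap, types);
    for (; *types != '\0'; types++) {
        if (*types == 'i')
            sum += va_arg(ap, int);     /* char and short also arrive as int */
        else if (*types == 'd')
            sum += va_arg(ap, double);  /* float also arrives as double */
        /* any other code is ignored and consumes no argument */
    }
    va_end(ap);
    return sum;
}
```

For example, sum_mixed("idi", 1, 2.5, 3) reads an int, a double and another int. The function has no way to check that the type string matches the arguments actually passed, which is the same weakness that makes mismatched printf format strings dangerous.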

Printf working principle

  • printf takes multiple arguments using the varargs mechanism.

  • The user supplies a format string and the arguments, as in printf("Hello, my name is %s having an id %d", name, id);

  • printf uses an internal buffer to construct the output string.

  • printf iterates through the characters of the format string and copies each one to the output buffer, stopping only at %. A % means there is an argument to convert. Arguments can be of type char, int, long, float, double or string. printf converts the argument to a string and appends it to the output buffer; if the argument is already a string, it is simply copied.

  • Finally, when printf reaches the end of the format string, it writes the entire buffer to stdout.

Consider this C code that demonstrates the internal functionality of printf:


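#include <stdio.h>
#include <stdarg.h>

/* Convert value to text in base 10 or 16 and return the length written.
   Stands in for itoa(), which is not part of standard C. */
static int int_to_str(int value, char *out, unsigned int base)
{
    const char *digits = "0123456789abcdef";
    char rev[32];
    unsigned int u = (unsigned int)value; /* hex shows the raw bit pattern */
    int n = 0, len = 0;

    if (base == 10 && value < 0)
    {
        out[len++] = '-';
        u = 0u - u; /* magnitude, safe even for INT_MIN */
    }
    do
    {
        rev[n++] = digits[u % base]; /* digits come out in reverse order */
        u /= base;
    } while (u != 0);
    while (n > 0)
        out[len++] = rev[--n];
    out[len] = '\0';
    return len;
}

int print(const char *str, ...)
{
    va_list vl;
    int i = 0, j = 0;
    char buff[100] = {0};

    va_start(vl, str);
    /* Continue only while there is room for the longest 32-bit int
       (11 characters plus the terminator); longer output is truncated. */
    while (str && str[i] && j + 12 <= (int)sizeof(buff))
    {
        if (str[i] == '%')
        {
            i++;
            switch (str[i])
            {
                case 'c': /* char is promoted to int when passed via ... */
                    buff[j++] = (char)va_arg(vl, int);
                    break;
                case 'd':
                    j += int_to_str(va_arg(vl, int), &buff[j], 10);
                    break;
                case 'x':
                    j += int_to_str(va_arg(vl, int), &buff[j], 16);
                    break;
                case '\0': /* lone '%' at the end: step back so the loop ends */
                    i--;
                    break;
            }
        }
        else
        {
            buff[j++] = str[i]; /* ordinary character: copy it as-is */
        }
        i++;
    }
    fwrite(buff, 1, j, stdout);
    va_end(vl);
    return j;
}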
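The listing above does not handle strings, although the working principle says a string argument is simply copied. The sketch below shows one way that step could look, writing into a caller-supplied buffer with a size limit so the output can never overflow. The names mini_format and put_char are hypothetical, and only %s, %c and %% are supported.

```c
#include <stdarg.h>
#include <stddef.h>

/* Append one character if room remains; always leave space for '\0'. */
static void put_char(char *out, size_t cap, size_t *len, char ch)
{
    if (*len + 1 < cap)
        out[(*len)++] = ch;
}

/* Hypothetical mini formatter: supports %s, %c and %% only.
   Returns the number of characters stored, excluding the terminator. */
size_t mini_format(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t len = 0;

    if (cap == 0)
        return 0;

    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            put_char(out, cap, &len, *fmt);
            continue;
        }
        fmt++;                          /* look at the conversion character */
        if (*fmt == 's') {
            const char *s = va_arg(ap, char *);
            while (*s != '\0')          /* a string is copied byte by byte */
                put_char(out, cap, &len, *s++);
        } else if (*fmt == 'c') {
            put_char(out, cap, &len, (char)va_arg(ap, int));
        } else if (*fmt == '%') {
            put_char(out, cap, &len, '%');
        } else if (*fmt == '\0') {
            break;                      /* lone '%' at the end of the format */
        }
        /* any other conversion is skipped and consumes no argument */
    }
    va_end(ap);
    out[len] = '\0';
    return len;
}
```

With a 4-byte buffer, mini_format(small, sizeof small, "%s", "abcdef") stores "abc" and stops, which is the behaviour that distinguishes snprintf-style functions from sprintf.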
 

Scanf working principle


Key points:

  • scanf is the reverse process of printf.
  • scanf reads an input string from the console.
  • It iterates over the characters of the format string and stops at "%". scanf then reads a line from stdin; the user's input arrives as a string.
  • It converts the string to char, int, long, float or double and stores the value through the pointer passed as the argument. In the case of a string, it simply copies the string to the output.

Consider this C code that demonstrates the internal functionality of scanf:


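#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <ctype.h>

int scan(const char *str, ...)
{
    va_list vl;
    int i = 0, j = 0, ret = 0, done = 0;
    char buff[100] = {0};
    char *out_loc;

    /* Read one line from stdin; fgets never writes past the end of buff */
    if (fgets(buff, sizeof(buff), stdin) == NULL)
        return EOF;

    va_start(vl, str);
    while (!done && str && str[i])
    {
        if (str[i] == '%')
        {
            i++;
            switch (str[i])
            {
                case 'c':
                    if (buff[j] == '\0') { done = 1; break; } /* input used up */
                    *va_arg(vl, char *) = buff[j++];
                    ret++;
                    break;
                case 'd':
                case 'x':
                {
                    long v = strtol(&buff[j], &out_loc, str[i] == 'd' ? 10 : 16);
                    if (out_loc == &buff[j]) { done = 1; break; } /* no digits */
                    *va_arg(vl, int *) = (int)v;
                    j = (int)(out_loc - buff); /* continue after the number */
                    ret++;
                    break;
                }
                default: /* unsupported conversion or lone '%' */
                    done = 1;
                    break;
            }
        }
        else if (isspace((unsigned char)str[i]))
        {
            /* whitespace in the format skips any whitespace in the input */
            while (isspace((unsigned char)buff[j]))
                j++;
        }
        else if (buff[j] == str[i])
        {
            j++; /* a literal character must match the input */
        }
        else
        {
            done = 1; /* mismatch: stop, as scanf does */
        }
        i++;
    }
    va_end(vl);
    return ret;
}

/* print() is the function from the previous listing;
   compile both listings together as one file. */
int main(void)
{
    char c = 0;
    int i = 0, h = 0, ret;

    ret = scan("%c %d %x", &c, &i, &h);
    print("C = %c, I = %d, H = %x, Return %d\n", c, i, h, ret);
    return 0;
}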
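The return value deserves attention in real code: like the ret counter above, the standard scanf family returns the number of successful conversions, and checking it is the only way to know whether the outputs were actually written. A minimal sketch using the standard sscanf follows; the helper name parse_record is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parses "<char> <decimal> <hex>" from a string.
   Returns 1 only if all three conversions succeeded, otherwise 0. */
int parse_record(const char *line, char *c, int *i, unsigned int *h)
{
    /* sscanf returns the number of items converted, or EOF if the
       input ends before the first conversion. */
    return sscanf(line, "%c %d %x", c, i, h) == 3;
}
```

A caller would typically read a line with fgets, pass it to parse_record, and use the outputs only when the result is 1. For the input "a ten ff" the %d conversion fails, so the function reports failure and i and h are left untouched.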

Conclusion


In the theoretical sense, the format strings used by printf, scanf and the related functions can be described by a context-free grammar, which is a means of formalising the rules of computer languages, including the C language itself. Because the format language is so simple, it is probably not often implemented this way, but the grammar may be a helpful starting point for understanding the processes involved.
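To make the grammar idea concrete, a simplified conversion specification can be written as "%[flags][width][.precision]conv" and parsed one rule at a time. The sketch below is an assumption-laden illustration, not how any particular C library does it: the struct conv_spec and the functions parse_number and parse_spec are hypothetical, length modifiers and "*" are left out, and only a few conversion characters are accepted.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one parsed conversion specification. */
struct conv_spec {
    char flags[6];   /* any of "-+ #0", in the order seen, no duplicates */
    int width;       /* -1 if absent */
    int precision;   /* -1 if absent */
    char conv;       /* conversion character, e.g. 'd' or 's' */
};

/* Parse a decimal number at *p, advancing *p; returns -1 if no digits.
   Overflow on absurdly long numbers is not handled in this sketch. */
static int parse_number(const char **p)
{
    int n = 0;

    if (!isdigit((unsigned char)**p))
        return -1;
    while (isdigit((unsigned char)**p)) {
        n = n * 10 + (**p - '0');
        (*p)++;
    }
    return n;
}

/* Parse "%[flags][width][.precision]conv" starting at s.
   Returns a pointer just past the specification, or NULL on a syntax error. */
const char *parse_spec(const char *s, struct conv_spec *out)
{
    size_t nflags = 0;

    if (*s != '%')
        return NULL;
    s++;

    out->flags[0] = '\0';
    while (*s != '\0' && strchr("-+ #0", *s) != NULL) {
        if (strchr(out->flags, *s) == NULL) {   /* record each flag once */
            out->flags[nflags++] = *s;
            out->flags[nflags] = '\0';
        }
        s++;
    }
    out->width = parse_number(&s);
    out->precision = -1;
    if (*s == '.') {
        s++;
        out->precision = parse_number(&s);
        if (out->precision < 0)
            out->precision = 0;                 /* "%.d" means precision 0 */
    }
    if (*s == '\0' || strchr("cdxsfu%", *s) == NULL)
        return NULL;                            /* unknown conversion */
    out->conv = *s;
    return s + 1;
}
```

For the input "%-08.3f!" this yields flags "-0", width 8, precision 3 and conversion 'f', and the returned pointer refers to the "!" that follows the specification.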
