If you have not known before, scanf(3)
and fgets(3)
are both functions intended for reading something from standard input and doing something with the result – in the former case it is interpreted and the results might possibly be stored in specified arguments and in the latter case the result is simply put into a buffer. The first one is commonly recommended to beginner C programmers for reading something from the user and parsing it. However, it should be avoided mainly because it is very error-prone and it is difficult to understand how it actually works for new people. Instead of scanf(3)
, sscanf(3)
should be used in combination with fgets(3)
. That way, you can easily guess in what state your standard input stream is after using those functions and you get other benefits. Let me give some examples and explain more.
The following two examples are functionally the same – they both read two numbers from standard input and print them. However, the first one uses scanf(3)
and the second one uses fgets(3)
. Here they are:
#include <stdio.h> int main(int argc, char *argv[]) { int ret, num; ret = scanf("%d", &num); if (ret == 1) printf("%d\n", num); return 0; }
#include <stdio.h> int main(int argc, char *argv[]) { char buf[100]; if (buf == fgets(buf, 100, stdin)) { int num, ret; ret = sscanf(buf, "%d", &num); if (ret == 1) printf("%d\n", num); } return 0; }
You can try them out for yourself. Let us go through the list of differences. I think the best way to illustrate them is to go through a list of different kinds of inputs that the user maybe provide to these programs and see what are the differences:
- Valid input. Let us say the user entered “123\n”.
scanf
would read “123” fromstdin
and it would leave the “\n” in the stream. However,fgets
would eat the “\n” as well. This is where the first problem occurs: new C programmers tend to think thatscanf
would read the “\n” as well or they do not realise that it is still there. This might not pose an issue if yourstdin
is not line buffered (e.g. when piping a file into a program) however most of the time it is – as far as I know most of the terminals wait for the user to press “Enter” before sending the user’s input to a program in the foreground. On the other hand,fgets
would leavestdin
in a predictable state – you will always know that if ‘\n’ existed instdin
then it was eaten (unless your buffer is too small). And it is very easy to check this hypothesis – just check if the last character in your buffer is ‘\n’. Also, you know that you need to read more characters (digits) fromstdin
to get the full number if the last character is not ‘\n’ and you still have something to read fromstdin
. - Valid input but with some whitespace at the beginning.
scanf
automagically skips over it but howeverfgets
presents an opportunity for you to see what kind of whitespace was at the beginning, before the actual number. Buffer size becomes an important question in this case. As always, read until you got the full line. This might or might not be useful depending on your case. In general,fgets
in this case introduces more transparency in the process. - Invalid input. This is the case when it does not match the format specifier, characters are read from
stdin
but they are not brought back.scanf
might accidentally eat a character fromstdin
and it would be gone into the dark abyss unless it was stored in a buffer somewhere. If I remember correctly, it disappears as long as you read it from stdin but maybe on some Unix you can read it back again. This raises a confusion for the user because the user does not know how much was read fromstdin
exactly. The other function combination lets you know exactly what was read fromstdin
. This might cause even more confusion when two or more pairs ofscanf
are used one after the other because it is hard to know what is left instdin
after the preceding calls toscanf
.
In general, my recommendation is this: avoid using scanf
unless you can absolutely control what the standard input is going to be and what will be its format. Also, besides all of the aforementioned problems, scanf
does present some security issues. For example, the format "%s"
lets the user input any length string into a specified char *
. A malicious user can easily use this to over-run the buffer and write any arbitrary data to memory. I hope you will take this into account the next time you will write code such as this.