Counting the number of characters in comments line in C

An answer to this question on Stack Overflow.

Question

I need to count the characters in the comments of a C program that is supplied on standard input. This is my function, but for some reason it's not counting right. Can you help me, please?

int characters(FILE *file)
{
     int i=0;
     char ch[500], *p;
     while (fgets(ch, sizeof(ch),file)!=NULL)
     {
        p=ch;
        while (*p)
        { 
           if (*p=='/')
           { 
              p++;
              if (*p=='*')
              {
                 p++;
                 while (*p!='*' && *(p++)!='/')
                 {
                    i++;
                    p++;
                 }
              }
           }
           else
              p++;
         }
     }

   return i;
}

Answer

Your code scared me.

There were a lot of pointers going on, and nested loops.

It's easy to get the logic wrong in there, and the code is difficult to extend if you need to make changes.

Might I suggest a different solution?

A state machine!

We'll read in the file one character at a time and keep track of what state the machine is in. We'll then use this to decide whether or not we are in a comment.

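#include <stdio.h>

#define S_CODE		1
#define S_ONESLASH	2
#define S_LINECOMMENT	3
#define S_BLOCKCOMMENT	4
#define S_BLOCKSTAR	5

/* Counts the characters inside comments. The comment delimiters and
   the newline that ends a // comment are not counted. */
int characters(FILE *file){
	int ccount=0;
	int ch;	/* int, not char, so EOF can be told apart from real characters */
	int state=S_CODE;
	while ((ch=fgetc(file))!=EOF){
		switch(state){
			case S_CODE:
				if (ch=='/')
					state=S_ONESLASH;
				break;
			case S_ONESLASH:
				if (ch=='/')
					state=S_LINECOMMENT;
				else if (ch=='*')
					state=S_BLOCKCOMMENT;
				else
					state=S_CODE;
				break;
			case S_LINECOMMENT:
				if (ch=='\n')
					state=S_CODE;
				else
					ccount++;
				break;
			case S_BLOCKCOMMENT:
				if (ch=='*')
					state=S_BLOCKSTAR;	/* may be the start of the closing delimiter */
				else
					ccount++;
				break;
			case S_BLOCKSTAR:
				if (ch=='/')
					state=S_CODE;	/* closing delimiter seen, the comment is over */
				else if (ch=='*')
					ccount++;	/* the previous '*' was content, this one is pending */
				else {
					ccount+=2;	/* the pending '*' and this character are both content */
					state=S_BLOCKCOMMENT;	/* still inside the comment */
				}
				break;
		}
	}
	return ccount;
}

int main(int argc, char **argv){
	/* Read the named file if one is given, otherwise standard input. */
	FILE *fin=(argc>1) ? fopen(argv[1],"r") : stdin;
	if (fin==NULL){
		perror(argv[1]);
		return 1;
	}
	printf("%d\n",characters(fin));
	if (fin!=stdin)
		fclose(fin);
	return 0;
}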

Notice how the characters /, * and \n drive the transitions between the machine's states, and how some states increment the comment-character counter while others do not. I think it's much easier to keep track of what's going on here.
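
If you want to poke at the transitions without creating a file, here is a minimal sketch of the same idea working on a string instead of a FILE. The names count_comment_chars and enum state are made up for this illustration. It also assumes a particular convention: the delimiters (//, /* and */) and the newline that ends a line comment are not counted. Like the file version, it knows nothing about string literals, so a "/*" inside a string would still be treated as a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the states mirror the ones described above. */
enum state { CODE, ONE_SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR };

/* Count the characters inside comments in a NUL-terminated string.
 * Assumption: delimiters and the newline ending a line comment are
 * not counted. */
int count_comment_chars(const char *src)
{
    assert(src != NULL);

    enum state st = CODE;
    int count = 0;

    for (; *src != '\0'; src++) {
        char ch = *src;
        switch (st) {
        case CODE:
            if (ch == '/')
                st = ONE_SLASH;
            break;
        case ONE_SLASH:
            if (ch == '/')
                st = LINE_COMMENT;
            else if (ch == '*')
                st = BLOCK_COMMENT;
            else
                st = CODE;          /* it was just a lone slash */
            break;
        case LINE_COMMENT:
            if (ch == '\n')
                st = CODE;
            else
                count++;
            break;
        case BLOCK_COMMENT:
            if (ch == '*')
                st = BLOCK_STAR;    /* may be the start of the terminator */
            else
                count++;
            break;
        case BLOCK_STAR:
            if (ch == '/') {
                st = CODE;          /* terminator complete, count nothing */
            } else if (ch == '*') {
                count++;            /* previous '*' was content, this one is pending */
            } else {
                count += 2;         /* pending '*' and this character are content */
                st = BLOCK_COMMENT; /* still inside the comment */
            }
            break;
        }
    }
    return count;
}
```

With this convention, count_comment_chars("int x; // hi\n") gives 3, and count_comment_chars("/* a * b */") gives 7, because the lone * in the middle sends the machine to BLOCK_STAR and then back to BLOCK_COMMENT rather than out of the comment.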