Reading formatted strings from a file into an array in C

I am new to the C programming language and trying to improve by solving problems from the Project Euler website using only C and its standard libraries. I have covered basic C fundamentals (I think), functions, pointers, and some basic file IO, but now I am running into some issues.

The question is about reading a text file of first names and calculating a "name score". I know the algorithm I am going to use and have most of the program set up, but I just cannot figure out how to read the file correctly.

The file is in the format "Nameone","Nametwo","billy","bobby","frank"... I have searched and searched and tried countless things but cannot seem to read these as individual names into an array of strings (I think that's the right way to store them individually?). I have tried using sscanf/fscanf with %[^\",]. I have tried different combos of those functions and fgets, but my understanding of fgets is that every time I call it, it will get a new line, and this is a text file with over 45,000 characters all on the same line.

I am unsure whether my problem is a misunderstanding of the scanf functions or a misunderstanding of how to store an array of strings. As far as the array of strings goes, I think I have realized that declaring an array of strings does not allocate memory for the strings themselves, so that is something I need to do. But I still cannot get anything to work.

Here is the code I have now to try to just read in some names I enter from the command line to test my methods.

This code works to input any string up to buffer size(100):

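#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
   char input[100];
   char *names[10];
   int count = 0;                 // Number of names actually stored

   printf("\nEnter up to 10 names\nEnter an empty string to terminate input: \n");

   for (int i = 0; i < 10; i++)
   {
      printf("%d: ", i);
      if (fgets(input, sizeof input, stdin) == NULL)
      {
         break;                   // EOF or read error
      }
      input[strcspn(input, "\n")] = '\0';   // Delete newline character, if present

      size_t length = strlen(input);
      if (length < 1)
      {
         break;
      }

      names[i] = malloc(length + 1);        // +1 for the terminating '\0'
      assert(names[i] != NULL);
      strcpy(names[i], input);
      count++;
   }

   for (int i = 0; i < count; i++)
   {
      free(names[i]);
   }
   return 0;
}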

However, I simply cannot make this work for reading in the formatted strings.

PLEASE advise me as to how to read it in with the format. I have previously used sscanf on the input buffer and that has worked fine, but I don't feel like I can do that on a 45000+ char line. Am I correct in assuming this? Is this even an acceptable way to read strings into an array?

I apologize if this is long and/or not clear, it is very late and I am very frustrated.

Thanks to anyone and everyone for helping, and I am looking forward to finally becoming an active member on this site!


There are really two basic issues here:

  1. Whether scanning the input with the scanf family is the proper strategy here. I would argue it is not: it might work for this task, but it breaks too easily once the input gets more complicated.
  2. How to handle a 45k string.

In reality you won't run into many strings of this size, but it is nothing a modern computer can't easily handle. Since this is for learning purposes, learn iteratively.

The easiest first approach is to fread() the entire line/file into an appropriately sized buffer (adding the terminating '\0' yourself, since fread() does not) and parse it yourself. You can use strtok() to break up the comma-delimited tokens and then pass each token to a function that strips the quotes and returns the word. Add the word to your array.
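A minimal sketch of this first approach, assuming the file fits in memory. The helper names read_whole_file and split_names are made up for illustration, and the quotes are trimmed in place rather than copied out by a separate function:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read a whole file into a freshly allocated,
   NUL-terminated buffer. Returns NULL on failure; the caller frees it. */
char *read_whole_file(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) { fclose(fp); return NULL; }

    size_t got;
    /* Always leave one byte free for the terminating '\0'. */
    while ((got = fread(buf + len, 1, cap - len - 1, fp)) > 0) {
        len += got;
        if (cap - len - 1 == 0) {           /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) { free(buf); fclose(fp); return NULL; }
            buf = bigger;
            cap *= 2;
        }
    }
    fclose(fp);
    buf[len] = '\0';                        /* fread() does not terminate */
    return buf;
}

/* Hypothetical helper: split buf on commas with strtok() and trim one
   leading and one trailing quote from each token. The names[] entries
   point INTO buf, so buf must stay alive while they are in use.
   Returns the number of names stored (at most max_names). */
size_t split_names(char *buf, char *names[], size_t max_names)
{
    size_t count = 0;
    for (char *tok = strtok(buf, ",\r\n");
         tok != NULL && count < max_names;
         tok = strtok(NULL, ",\r\n")) {
        size_t n = strlen(tok);
        if (n > 0 && tok[n - 1] == '"')
            tok[n - 1] = '\0';              /* drop closing quote */
        if (*tok == '"')
            tok++;                          /* skip opening quote */
        names[count++] = tok;
    }
    return count;
}
```

Typical use would be to call read_whole_file() on the names file, pass the result to split_names() with an array of a few thousand char pointers, and free() the buffer once the names are no longer needed. Because the pointers refer into the buffer, no per-name malloc() is required.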

For a second pass you can do away with strtok() and just parse the string yourself by iterating over the buffer and breaking up the comma tokens yourself.
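One way that second pass might look, as a sketch only: walk the buffer one character at a time, copy whatever sits between a pair of quotes, and ignore everything else. The name parse_names and the NAME_LEN limit are assumptions for illustration; names longer than the limit are silently truncated here.

```c
#include <stddef.h>

#define NAME_LEN 32   /* assumed maximum name length, including the '\0' */

/* Hypothetical helper: copy each quoted name from src into names[].
   Returns the number of complete names stored (at most max_names). */
size_t parse_names(const char *src, char names[][NAME_LEN], size_t max_names)
{
    size_t count = 0;      /* names completed so far */
    size_t pos = 0;        /* write position inside the current name */
    int in_quotes = 0;     /* are we between an opening and closing quote? */

    for (; *src != '\0' && count < max_names; src++) {
        if (*src == '"') {
            if (in_quotes) {               /* closing quote: finish the name */
                names[count][pos] = '\0';
                count++;
            }
            pos = 0;
            in_quotes = !in_quotes;
        } else if (in_quotes && pos < NAME_LEN - 1) {
            names[count][pos++] = *src;    /* character inside a name */
        }
        /* Anything outside quotes (commas, newlines) is skipped. */
    }
    return count;
}
```

Unlike strtok(), this leaves the source buffer untouched, and a name containing a comma inside its quotes would still be read as one name.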

Last but not least you can write a version that reads smaller chunks of the file into a smaller buffer and parses them. This has the added complexity of handling multiple reads and managing the buffers to account for half-read tokens at the end of a buffer and so on.
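A rough sketch of the chunked version. Instead of shuffling leftover bytes between buffers, this variant keeps the parser state (the partly copied name and whether we are inside quotes) in a small struct that survives from one read to the next, which is a simplification of the buffer management described above. The struct, the function names, and the sizes are all assumptions for illustration.

```c
#include <stdio.h>

#define NAME_LEN   32    /* assumed maximum name length, including the '\0' */
#define CHUNK_SIZE 256   /* deliberately small read buffer */

/* Parser state that persists across chunks. */
struct name_parser {
    char   (*names)[NAME_LEN];  /* caller-provided storage */
    size_t max_names;
    size_t count;               /* complete names stored so far */
    size_t pos;                 /* write position in the current name */
    int    in_quotes;           /* nonzero while inside a quoted name */
};

/* Feed one chunk to the parser. A name cut off at the end of the chunk
   simply stays half-written until the next call completes it. */
void parser_feed(struct name_parser *p, const char *chunk, size_t len)
{
    for (size_t i = 0; i < len && p->count < p->max_names; i++) {
        char c = chunk[i];
        if (c == '"') {
            if (p->in_quotes) {            /* closing quote: finish the name */
                p->names[p->count][p->pos] = '\0';
                p->count++;
            }
            p->pos = 0;
            p->in_quotes = !p->in_quotes;
        } else if (p->in_quotes && p->pos < NAME_LEN - 1) {
            p->names[p->count][p->pos++] = c;
        }
    }
}

/* Read fp in CHUNK_SIZE pieces and return the number of names found. */
size_t read_names_chunked(FILE *fp, char names[][NAME_LEN], size_t max_names)
{
    struct name_parser p = { names, max_names, 0, 0, 0 };
    char chunk[CHUNK_SIZE];
    size_t got;

    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0)
        parser_feed(&p, chunk, got);
    return p.count;
}
```

For a 45k file this buys nothing over reading it in one go, but the same structure works unchanged on input that is too large to hold in memory.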

In any case, break the problem into chunks and learn with each refinement.

EDIT

#include <stdio.h>
#include <string.h>

#define MAX_STRINGS 5000
#define MAX_NAME_LENGTH 30

/* Copy str into newstr, skipping any double quotes. newstr must hold
   at least size bytes (size >= 1); the result is always NUL-terminated
   and is truncated if it would not fit. */
char* stripQuotes(const char *str, char *newstr, size_t size)
{
    char *temp = newstr;

    while (*str && (size_t)(temp - newstr) < size - 1)
    {
        if (*str != '"')
        {
           *temp = *str;
           temp++;
        }

        str++;
    }

    *temp = '\0';
    return(newstr);
}

int main(void)
{
    char  fakeline[] = "\"Nameone\",\"Nametwo\",\"billy\",\"bobby\",\"frank\"";
    char *token;
    int   index = 0;
    /* static keeps roughly 150 KB off the stack */
    static char nameArray[MAX_STRINGS][MAX_NAME_LENGTH];

    /* strtok() modifies fakeline in place, replacing each comma with '\0'. */
    for (token = strtok(fakeline, ",");
         token != NULL && index < MAX_STRINGS;
         token = strtok(NULL, ","))
    {
        stripQuotes(token, nameArray[index++], MAX_NAME_LENGTH);
    }

    /* Show what was parsed. */
    for (int i = 0; i < index; i++)
    {
        printf("%d: %s\n", i, nameArray[i]);
    }

    return(0);
}