PR2EN7: Dynamic memory allocation

Lab materials

Lab tasks

For this week, you will be creating a random data generator as the base task. This task is extended by two extra tasks. Similarly built generators are used in multiple places throughout the course to generate data files.

NB! One common recommendation for generating data files is to use an LLM. That works for short files, but it doesn’t work well (or gets very costly) if you wish to generate millions of lines of complex data.

Task 1 [W07-1]: Random data file generator

The first task is to write a program that generates random data files. These are useful for testing other applications without having access to real data.

Download the starter code: https://blue.pri.ee/ttu/files/iax0584/aluskoodid/7_generator_starter.zip

Requirements
  • Build your application on the starter code provided
  • Ask the user how many entries they wish to generate. There must be no upper bound!
  • All entries are stored in a struct array during generation. The struct array itself is allocated using dynamic memory allocation.
  • Pick a random first and last name and curriculum code from the pools of predefined values
  • Generate admission points as a random number (e.g. 24.7)
    • Range must be from 10.0 to 30.0, ends inclusive.
    • Precision must be 0.1 points
  • Sort the generated entries based on the last name. If last names for two entries match, order those by their first name.
  • Write the output to a file in the following format:
    <index> <last name> <first name> <curriculum code> <admission points>
    • Index is a unique integer. The first entry has index 0 and every following index is incremented by one
  • Make sure that all resources (including memory) are freed when the program exits. Use valgrind to verify this.
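
One possible way to meet the admission points requirement above, given as a minimal sketch rather than the reference solution: work in tenths of a point so that rand() only has to produce an integer, then divide by 10.0 at the very end. The function name random_points and its parameters are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: random value in [lo, hi] with 0.1 precision.
 * Assumes non-negative bounds that are multiples of 0.1. */
double random_points(double lo, double hi)
{
    /* Convert the bounds to whole tenths, e.g. 10.0 -> 100, 30.0 -> 300 */
    int loTenths = (int)(lo * 10.0 + 0.5);
    int hiTenths = (int)(hi * 10.0 + 0.5);
    assert(loTenths <= hiTenths);

    /* +1 makes the upper end inclusive */
    int span = hiTenths - loTenths + 1;

    /* rand() % span is slightly biased, which is fine for test data */
    int tenths = loTenths + rand() % span;

    return tenths / 10.0;
}
```

Remember to seed the generator once at the start of the program, e.g. with srand(time(NULL)), otherwise every run produces the same sequence.
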
Workflow
  • Write the necessary structure declaration for the data in the header file.
  • Ask the user how many entries they want to generate
  • Allocate the memory, check for allocation failures
  • Generate the necessary entries
    • Every person will have all their members generated randomly
    • For the struct members that are picked out of the pools, you must generate a random number to pick it out
      Hint: you can either copy the name to your struct or only keep a pointer to the name
    • Generate the admission points (10.0 <= points <= 30.0, precision 0.1)
      Hint: Think about this mathematically, e.g. what’s the difference between 30 and 300!? The rand() function will always give you an int, regardless of what you try to do to it.
  • Sort the array
    • Think about why it is a bad idea to use bubble/selection/insertion sort here!
  • Write the results to an output file
  • Free the dynamically allocated memory
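
The allocation step of the workflow above could look roughly like this. It is a sketch under assumptions: the struct layout, the member names and the function name alloc_students are hypothetical, and your own declaration belongs in the header file. The name members are pointers into the predefined pools (one of the two options from the hint), so only the array itself needs to be freed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record layout for one generated entry */
typedef struct {
    int index;
    const char *firstName;   /* points into the name pool, not a copy */
    const char *lastName;
    const char *curriculum;
    double points;
} student_t;

/* Allocates an array of count entries. Returns NULL on failure,
 * so the caller must check the result before using it. */
student_t *alloc_students(size_t count)
{
    /* calloc zero-fills the memory; sizeof *arr follows the pointer type */
    student_t *arr = calloc(count, sizeof *arr);
    if (arr == NULL) {
        fprintf(stderr, "Failed to allocate memory for %zu entries\n", count);
    }
    return arr;
}
```

The caller checks the returned pointer, fills and sorts the array, writes the file and finally calls free() on the same pointer exactly once.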

Check with Valgrind for correctness! Not only at the end, but also whenever you encounter odd behavior, crashes or corruption!

Qsort comparison function

I’m proposing two different comparison function options. Pick the one that you understand better or write your own.

In the first case, the type cast will be done as needed, removing the need for additional variables.

In the second case, temporary pointers are used that will take up some memory, but improve the readability of the code.
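
The two styles described above might look roughly like the following. This is an illustrative sketch, not the original listing: the struct person_t, its member names and both function names are assumptions, and the struct is trimmed down to the two sort keys.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct, reduced to the members used for sorting */
typedef struct {
    const char *firstName;
    const char *lastName;
} person_t;

/* Option 1: cast in place, no extra variables for the pointers */
int compare_cast(const void *a, const void *b)
{
    int diff = strcmp(((const person_t *)a)->lastName,
                      ((const person_t *)b)->lastName);
    if (diff != 0)
        return diff;

    /* Last names match, fall back to the first name */
    return strcmp(((const person_t *)a)->firstName,
                  ((const person_t *)b)->firstName);
}

/* Option 2: temporary typed pointers, easier to read */
int compare_tmp(const void *a, const void *b)
{
    const person_t *pa = a;
    const person_t *pb = b;

    int diff = strcmp(pa->lastName, pb->lastName);
    if (diff != 0)
        return diff;

    return strcmp(pa->firstName, pb->firstName);
}
```

Either function is passed to qsort in the same way, e.g. qsort(people, count, sizeof *people, compare_tmp);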

Testing

The output of your application should be relatively simple and short.

Then take a look at the output file and make sure the results are correct. You will have a different result because all fields will be generated randomly.

Extra task 1 [W07-2]: Output file formats

In this task, you will need to add CSV as a secondary output format for your application. The user must be able to choose which output format they want.

Requirements
  • Add ability to your program to generate data in the CSV format
    • First line of the CSV must be the header with the field names.
    • This will be followed by data lines. Each member on a line is separated by a comma. NB! Do not put a space after the comma for CSV file!
  • Ask the user which output file format they need (CSV or space-delimited). Generate the appropriate type of file.
  • For the space-delimited file, the extension must be .txt and for the CSV file, use the extension .csv.
  • Make sure that the CSV is correctly generated: try to open (or import) it using LibreOffice Calc or Microsoft Office and see if they recognize the format correctly.
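
A CSV writer matching the requirements above could be sketched like this. The struct, the function name write_csv and the header field names are assumptions chosen for the example, so pick names that match your own data structure.

```c
#include <stdio.h>

/* Hypothetical record layout */
typedef struct {
    const char *lastName;
    const char *firstName;
    const char *curriculum;
    double points;
} student_t;

/* Writes a header line followed by one line per entry.
 * Returns 0 on success, -1 if any write fails. */
int write_csv(FILE *out, const student_t *arr, size_t count)
{
    /* Header with the field names, no spaces after the commas */
    if (fprintf(out, "index,last_name,first_name,curriculum,points\n") < 0)
        return -1;

    for (size_t i = 0; i < count; i++) {
        /* The array position doubles as the unique index, starting at 0 */
        if (fprintf(out, "%zu,%s,%s,%s,%.1f\n", i, arr[i].lastName,
                    arr[i].firstName, arr[i].curriculum, arr[i].points) < 0)
            return -1;
    }
    return 0;
}
```

This simple form is only safe because the pooled values contain no commas or quotes. Values that do would have to be quoted.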

Extra task 2 [W07-3]: Settings

In this task, you will need to make the generation a bit more flexible. This requires adding options to select which fields are generated and what the points range is going to be.

Requirements
  • All settings must be kept in a structure – create a new structure
  • All settings must have default values
  • Program must use the defaults if the user does not wish to change the settings. What the user needs to do to alter the settings is up to you
    E.g. you can ask inside of the program if they wish to alter the settings or use command line arguments
  • User must be able to alter the following settings without editing the source code
    • Which data fields are generated (it must be possible to turn each one of them on or off).
    • Name of the output file (only the name part. Extension must be chosen automatically based on the output format!)
    • Output format (this is from extra task 1, move the location of the setting into the struct).
    • Number of items generated (from the base task, move the variable for the setting to the struct)
    • Lower and upper bounds for the admission points
  • NB! The number of entries generated must be asked regardless of whether the user wishes to change the settings or not. The purpose of this is just to keep all settings neatly in one struct.
  • The user must be shown the settings the generation will be performed under, regardless of whether they chose to edit them or not.

Note: if you wish, you can make a settings file, however it is not necessary. The alterations to the default settings do not need to be stored between executions. However if you do include a settings file, make sure that the user can change those settings within the program without editing the file manually.

Hint: You can make nice use of the ternary operator here:
printf("First name: %10s\n", settings.genFirstName ? "Yes" : "No");
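
One way the settings structure and its defaults might be laid out is sketched below. Apart from genFirstName, which appears in the hint, every name here (the other members, the enum, default_settings, build_file_name) and the default file name are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

typedef enum { FORMAT_TXT, FORMAT_CSV } output_format_t;

/* Hypothetical settings structure covering the required options */
typedef struct {
    bool genFirstName;
    bool genLastName;
    bool genCurriculum;
    bool genPoints;
    char fileName[64];        /* name only, without the extension */
    output_format_t format;
    size_t count;             /* always asked from the user */
    double pointsMin;
    double pointsMax;
} settings_t;

/* Returns a settings struct filled with default values */
settings_t default_settings(void)
{
    settings_t s = {
        .genFirstName = true,
        .genLastName = true,
        .genCurriculum = true,
        .genPoints = true,
        .fileName = "output",
        .format = FORMAT_TXT,
        .count = 0,
        .pointsMin = 10.0,
        .pointsMax = 30.0
    };
    return s;
}

/* Builds "<name><extension>" into dst based on the output format.
 * Returns 0 on success, -1 if dst is too small. */
int build_file_name(const settings_t *s, char *dst, size_t dstSize)
{
    const char *ext = (s->format == FORMAT_CSV) ? ".csv" : ".txt";
    int n = snprintf(dst, dstSize, "%s%s", s->fileName, ext);
    return (n >= 0 && (size_t)n < dstSize) ? 0 : -1;
}
```

Starting from default_settings() and overwriting only the members the user chooses to change keeps the "use defaults unless changed" logic in one place.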

After this class, you should

  • Know how to use dynamic memory allocation
  • Know how to check for memory leaks
  • Know in which situations it is reasonable to use dynamic memory and in which situations you should not
  • Know the difference between function call stack and heap
  • Know the differences between malloc and calloc

Additional materials