In this lab task, you are going to interface with an SQLite3 database using the libsqlite3-dev library. You start out by adding data to the existing database in the first lab task and continue by querying the data in the second lab task.
The lab has two advanced tasks, first of which switches out the adding of the data. Second one introduces you to updating existing data rows.
NB! The task might not be doable on the P drive due to locking issues with Windows SMB shares (unable to commit changes to the database) . To get around this, store the database file on the local computer (i.e. on Desktop) for the duration of the tasks and move to P drive after completion!
In this part you will practice doing INSERT queries. The base task does not need to be implemented in your C code! You can use the “Execute SQL” feature in the SQLite browser.
Requirements
Write down all the SQL queries you did (just as a text document)! Show the queries once all are performed and show the database. There should be a total of 5 queries!
ID values used for the primary key of each table(students.id ,
subjects.id ,
declarations.id ) must not be hardcoded into your queries! They are generated automatically by the database managem system (those attributes have auto increment as a part of their database design with each table having a corresponding sequence generator).
Perform the following insertions
Add yourself as a student into the database.
Add a subject that you have completed into the database.
Add 3 declarations to yourself. One to the subject that you just created and two more for subjects in the database.
Note: grades and eID do not need to be real.
Once data is added, present the queries you performed and show you database in SQLite Browser.
Lab task 2: Retrieving data
For this lab task, you will need to write queries to retrieve data from the database. All queries need to be performed in the C program you write. You have 3 sample codes provided in the beginning of this page. Pick the one you like the most and use that as a template to writing your own queries.
Task is divided into three separately graded parts.
Requirements
All queries for the part you intend to present must be completed. Each part is graded individually..
Data must be completely prepared on the database engine side – any kind of calculation (sums, averages, max etc) must be written into the SQL query. No calculations in your C code!
All relevant data must be printed. Do not print id values (e.g. subject.id, declaration.id, ..)
Queries can either be performed through a menu selectively or can just execute one after another. Up to you.
All resources must be released by the end of the program, no leaks – check with valgrind.
Task 2 gradable part 1
First part contains basic SELECT queries.
Query 1: Find all subjects that are less than 6 credits
Query 2: Find all subjects that have an exam, ordered by the number of credits from lowest to highest.
Note: Query 1 and 2 are independent of each other. Do not tie them together.
Task 2 gradable part 2
Second part contains more complex SELECT queries where you have to use JOIN and double JOIN to merge data from multiple tables together through their PK-FK key-pairs.
Query 1: Find all grades for students whose first name is “Marko”
Query 2: Find all classes you’ve completed, show with grades, starting from the highest grade.
Note: Query 1 and 2 are independent of each other. Do not tie them together.
Task 2 gradable part 3
In this part we will practice data aggregation for calculating for our SELECT queries. We will use both JOIN, GROUP BY and additional functions for calculating results.
Query: Find each students total earned credits and average grade
Advanced task 1: Adding yourself to the database more reliably
In the base task 1, likely you added data by just observing the values and hardcoding everything in. In a production database, this wouldn’t be feasible.
Information
For a better solution, we have 2 different paths we can take.
SQL supports subqueries – queries that are a part of another query. For an example, we can embed a SELECT query inside of an INSERT query.
For an example, if we want to add another car for Marko Mets in the sample database, we could execute
This of course isn’t perfect since there could be two people named Marko mets. However it will be good enough for the lab.
A better approach would be to insert ourselves as students first. Then get the ID value from that insertion using the query
SELECTlast_insert_rowid(); This will give us the ID value of that last insertion.
In a real database scenario this would be tied together with transactions so that there wouldn’t be race conditions (other people adding data to database before you get your id). However this is again unnecessary for us since we are using SQLite, which is an offline file based database.
Requirements
Write the insertion to the database as a part of your C program code.
If this code is executed again, make sure that a duplicate wouldn’t be created. Check before adding using a unique value (e.g. eID).
You must use the auto-generated ID value value. Hardcoding any IDs into the query is not permitted.
You can use either one of the proposed approaches from the information section.
Advanced task 2 : Changing existing data
In this task, you will practice UPDATE queries, where you will first have to identify the line and then change its attribute(s). You will need to generate and add uni-id’s to all students.
Requirements
Generate all the students Uni-ID’s.
Uni-IDs must be added to the database using UPDATE statements. Populate the uni_id attribute.
Your code must work for any number of students
You must also resolve conflicts if a Uni-ID already exists. Conflict resolution can (and should) be done in C.
This lab has a base task consisting of making two separate programs and comparing the performance of those programs. In addition, it has an advanced task adding a third program to the comparison.
Lab task: linear search vs binary search tree
In the base task, we will create two programs. Both of them will count how many times each name occurred in the data file. For the first program, we will implement a linear search algorithm. For the second one, we will use a binary search tree.
Find all the name combinations (first + last name) in the data file.
Count how many times each name combination occurred in the provided data file
Output the name combinations found
Note: since we are performance-focused, use a global variable for the FILE pointer in the trees task. Do not reopen the file every time you plan to print a node.
Use the following output structure: <first name> <last name> <count>
Provide two implementations for the same task
First one will use a dynamically expanding array, using realloc(), to keep track of the names and their occurrence counts
Second one will use a binary search tree.
Calculate the time it took your program to process the data
The calculation must at least include reading from the file and creation of the data structure.
You can also include printing and freeing.
Results must be comparable – if one program includes freeing in the time calculation, so must the other.
Print the number of unique combinations your program found
Note: Since we are performance-focused, use a global variable for the counter in the trees task
Your focus should be time efficiency. Leave the name variable in the struct statically declared – e.g.
char combinedName[64]
Program 1: Linear search
To start, open up the
example_linear.c and familiarize yourself with the code. Try to run it and look at the output.
Also notice how the program runs – it uses a command line argument for the name of the input file. E.g.
./program_name input_file_name
Alter the reading function to match the data file.
The current reading function assumes a single name (one word). The lab task data file contains 4 fields per row – eID, first name, last name and city.
Change the
fscanf() to account for the extra fields. You can immediately discard the eID and city – they won’t be needed.
Concatenate the name as
"Firstname Lastname" . Use the concatenated form both storing into the struct and comparing for the unique name. Hint:sprintf() is the easiest function for this.
Count the number of times a name exists
Add the counter variable to the struct. Either
int or
unsigned .
Before going forward, look at how the current code finds unique values and when it expands the array. Hint: look at the return value of
CheckNameUnique() function provided with the code – this will help you a lot.
Add the initialization for the counter you just created into the reading function if it’s a new name (not known before).
Add the update to the counter if a name already exists in our database.
Alter the print function.
Alter the printing statements to print to the output file.
Print out the number of times each name occurred.
Add printing of the number of names you got from the file
Add freeing of the struct
Add the timer
Add two
clock variables.
Place the two points when you wish to measure the time. E.g. start before reading function, end after freeing function.
Calculate and print out the total time taken. Print the time in the terminal!
Test your code without valgrind first, then with valgrind.
Run with valgrind. When running with valgrind, you need to write arguments to valgrind before the executable and arguments for your program in the end. Running with valgrind will take approximately 1 minute to complete!
E.g.
valgrind./example_linear random_people_city_data.txt
Program 2: Binary search tree
To start, open up the
example_bin_search_tree.c and familiarize yourself with the code. Try to run it and look at the output.
Also notice how the program runs – it uses a command line argument for the name of the input file. E.g.
./program_name input_file_name
Alter the struct. You need to add the string and counter, just as with the linear search.
Alter the reading function similarly to what we did before
Change the fscanf() statement to read 4 fields, you will only need the first and last name fields.
Concatenate the name so you can store it in a single variable.
Update the insert function
Change the parameter to be a string (the full name you wish to store into the tree).
Change the comparison operations. Remember, besides the two given in the sample code, you also have to account for the situation that the name already exists! In that case, you need to update the counter.
Update the
CreateNode() function
Change the parameter so that it can take in a name
Add an initialization to the counter
Update the
PrintTree() function
Make updates related to the new data structure we have
Note: because we are time optimizing the solution, you can make the
FILE * pointer as a global variable. This is again an acceptable use case for the scenario we have. Alternatively, you can pass it to the function.
The sample print function is given as “pre-order” traversal. Change the traversal type to such that the tree would be printed in the alphabetical order. For this you need to change the order of the action done to the current node and the recursive calls to the child nodes.
Implement a free function for the data structure
Just as any tree function, this also has to be a recursive function. Take inspiration from the
PrintTree() function for how to traverse nodes. NB! You need to pick the correct traversal (pre-, in- or post-order) for the freeing to be correctly performed!
Add the timer
Run your code through valgrind. You can also use command line arguments with valgrind. NB!
E.g.
valgrind./example_bin_search_tree random_people_city_data.txt
Compare time
Now compare both of the programs on runtime. Which takes longer, why?
Also think about how the results would change, if
You would change from (n + 1) to (n * 2)
realloc() strategy for linear search.
The amount of unique names would increase. How badly would either of the programs react to it. Try to express it in terms of the Big O notation (https://www.bigocheatsheet.com)
The file would contain more names, but the amount of unique names remained the same (just higher counts).
If you did the advanced, compare the binary search tree and trie. How would adding more unique names to the program affect it compared to binary search tree – any key advantages?
Testing
The following samples include the start of the results for each of the programs you will create. First two are within the base task, last on is the advanced task.
The total number of unique name combinations should be 11440.
Note: I’m only showing the order of results for you to validate your program. I’m not showing the output of the program, which is the time it took to perform the task and the number of unique combinations it found.
Linear search
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Ingrid Purga 14
Angela Leppik 13
Hendri Meier 21
Mihhail Saal 18
Aivar Tiidus 14
Julia Leppik 15
Ingrid Kaasik 18
Marko Lepik 14
Ants Ratas 26
Heimar Tomson 12
Yana Kuningas 10
Aivar Kurg 17
Edgar Johanson 14
Eliise Saks 14
Kristi Saar 18
...
Binary search tree / Trie
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Ain Aas 15
Ain Aasa 19
Ain Allik 11
Ain Anvelt 18
Ain Hein 17
Ain Heinsoo 20
Ain Helme 16
Ain Herkel 17
Ain Hunt 13
Ain Ilves 15
Ain Ivanov 12
Ain Jakobson 13
Ain Johanson 16
Ain Kaasik 20
Ain Kalda 9
...
Advanced task: Trie implementation
Create a third program to compare the efficiency. Use the trie data structure.
All the other requirements remain the same.
Recommended data structure
Compared to the sample on the slides, we don’t use a boolean true/false value to denote the end of a string (e.g.
isLeaf ). Instead, we use a counter. If the counter is 0, we consider it not to be a leaf node, if the counter is > 0, it’s a leaf node where a name ends.
1
2
3
4
5
6
typedefstructnode
{
charch;
intcount;
structnode *chars[ALPHA_LEN];
}trie_t;
Hint 1: your alphabet length should also include positions for a space and a minus symbol, which are present in the names. No other special symbols are used in the file.
Hint 2: Initialization in the
CreateNode() is crucial for both the counter and the pointer array.
Create a menu program that allows for manipulation of a singly linked list
New nodes must be added to the beginning of the list only.
A node contains a number (id) and a string (name).
id is auto-generated, unique, sequential integer starting from 0
Name is a string entered by the user during creation of the node.
Name is allocated dynamically, dependent on the length of the entered name.
Implement the
PrintNode() ,
CreateNode() ,
InsertNode() ,
FindNodeByID() ,
FindNodeByName() and
Unload() functions based on their description
Use the starter code provided as your base
Work through the task base on the step-by-step guide. Implement the functions as specified in the guide.
Step-by-step guide
NB! Notice that the structure declaration creates 2 valid data types –
struct node and
list . While both are the same thing by definition (they are aliases from typedef), one is used in the context for referring to a single member and the other is referring to the entire list.
1
voidPrintNode(list*pNode);
We will start by creating a function that prints out the contents of a single node. Add this function to the code file (prototype already exists in the header). It needs to print all members of the node.
PrintNode() function should be used whenever you need to print the contents of a node. This includes Find family of functions that you will create later. This way, whenever the contents of the node changes, the modifications for printing only need to be made once in a single place.
Don’t forget to check for NULL pointer to avoid crashing if the calling function decides to pass a NULL pointer.
Uncomment testing stage 1 and run your code!
NB! The testing strings are in UTF-8, which is fine for our Linux machines, however they could disagree with some Windows environments. Alter the strings if necessary.
NB! This function is already completed for you. Pay attention to how the loop is built. You need it for traversal in the later functions.
1
voidPrintList(list*pHead);
This function is used to print the entire linked list. It gets passed the pointer to the beginning of the list and it will traverse the list until it reaches the end of the list, indicated by a
NULL pointer..
The function also handles a
NULL pointer passed to it instead of a valid pointer to the beginning of the list – this will happen when the linked list is empty!
The function itself does not handle printing of the nodes, but rather calls
PrintNode() on every node to print its contents. This is why we completed that first.
This function is used to create a new node and initialize its values. It returns the newly created node. Values for the node are passed as parameters.
Create and allocate the node
Assign the ID value (sequentially incrementing value, use
static )
Allocate memory for the name and copy it over
Initialize the
pNext pointer
Return the struct pointer
NB! There are two memory allocations in this function. Both need to be checked. If there’s an error, return a NULL pointer! Program should continue running.
Uncomment testing stage 2 and run your code! Only expected changes are different address space (going from stack to heap) and increased allocations when checking with valgrind.
This function is what’s also called a destructor. It removes (deallocates) all nodes from them memory. This includes all the members that were allocated dynamically as well – in our case, the name.
To free a linked list, you need to loop through the entire list. To familiarize yourself with the traversal of the list, check the
PrintList() function.
When freeing the list, you should pay close attention to how you free the current node. After freeing the current node, we no longer have access to its
pNext pointer. Due to this, we need to create a temporary pointer that will have a copy of the current node’s address, which adds a little bit of complexity.
You have to complete 3 steps in the loop. The most elegant way of making it is by using a
while() loop.
Make a copy of the current node address (you will be freeing through this copy)
Move the main pointer to the next element (
pCurrent = pCurrent->pNext )
Free the current node and its members.
Uncomment testing stage 4. Run through valgrind. You should see 6 allocations and 6 frees.
==17208== All heap blocks were freed -- no leaks are possible
==17208==
==17208== For lists of detected and suppressed errors, rerun with: -s
==17208== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
1
structnode*FindNodeByName(list*pHead,char*name);
1
structnode*FindNodeByID(list*pHead,intid);
In this part, we will create search functions to either search by ID or by name. The functions return the pointer to the node if a match is found or a NULL-pointer when the element was not found.
Use
PrintNode() to output the result.
Uncomment testing stage 5 and check the output! Note that the error message is provided by PrintNode() function and is not specific to the search!
Congratulations! All base functionality is now finished! Now create the additional functions, prompts etc so that the program could be used through the menu as a command line utility. You can delete the testing code from main.
This means prompting the user for name when inserting a node or asking for search keywords, as well as checking the returns for functions as necessary. Once that is done, present the task.
Testing
This is the final expected output of the base task
==17678== All heap blocks were freed -- no leaks are possible
==17678==
==17678== For lists of detected and suppressed errors, rerun with: -s
==17678== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Advanced task 1: Alphabetically ordered list
In this task, we’re alter modify our list in order for it to be always sorted alphabetically by the name.
Requirements
Update the
InsertNode() function in order for it to add the new node in an alphabetically sorted order.
Must handle adding to the beginning, end and middle of the list.
Update the
FindNodeByName() function to take advantage of the list being sorted.
If you completed advanced task 2 first, also update the approriate
Remove() functions to also take advantage of this property
Notes
Adding to the beginning, end or middle of the list requires different handling of the pointers.
Take note of the positioning and choose appropriate method of adding to the list
Advanced task 2: Removal of nodes
Implement the two functions to remove nodes from the list.
Requirements
Implement
RemoveNodeByName() that removes a node from the list if it finds a matching name.
Implement
RemoveNodeByID() that removes a node from the list if it finds the matching ID value.
Take advantage of the ordered nature of the list (from advanced task 1)
Ensure the integrity of the list after removing an element
Notes
Find if the node exist in the list that you need to remove
Removing from a singly linked list will require you to keep track of the previous node location. You cannot reuse
FindNodeBy() function (You could, if it was a doubly linked list).
Removing from the beginning, end or middle of the list are different. Detect the position and then decide on the removal procedure!
Don’t forget to free the memory for the node completely
Double pointer is required to remove from the beginning of the list.
In this lab there are two base tasks, one completely independent advanced task for recursion and one advanced task as an extension to the stack task.
Task 1: Set of recursion tasks
As the first task, you need to create three recursive programs (parts 1, 2 and 3). The task can only be presented once all three are done!
NB! The recursions are extremely simple, avoid looking for solutions on the internet!
In the beginning of the lesson, we’ll create an example of a factorial task solved recursively!
Requirements
You can make the solution as one file with three recursions or as three separate programs. Examples are provided as separate programs.
In all cases, the programs must support positive integers as input. Negative input either has a defined base case or will cause an error.
Input can be asked inside of the application or read as a command line argument.
Iterative solutions are not permitted! All three parts must be solved recursively
Only present your task once all three parts are completed.
Part 1: Sum
Create an application that finds the sum of all positive integers until the entered integer. Negative argument for the recursive function returns the sum as 0 (no positive numbers to add, base case)
Create an application that finds the n-th Fibonacci number. The fibonacci numbers are calculated as the sum of two previous fibonacci numbers fibx = fibx – 1 + fibx – 2. Compared to the previous recursions, Fibonacci sequence requires two base cases.
Did you notice how long it took to find the last value? This is expected in the lab and you don’t need to fix anything, however you should understand why this happened! If not, think or ask!
All 3 parts are done, now present your lab!
You can time the execution of your program using the
time program. I’ve put the comparison of two different solutions for Fibonacci below. The second one uses tail-recursion. You can read about it on your own if necessary!
1
2
3
4
5
6
7
8
9
10
11
12
risto@risto-tux:~/Nextcloud/prog2/wk10_recursion_stack$ time ./fib 45
In this task, we will create a menu program to demonstrate some of the stack functions. The lab task is demonstrative only and is intended to support the dynamic memory allocation studied before – due to the added complexity of this, the benefits of speed are unfortunately diminished.
In the beginning of the lesson, we will create a partial solution with the Push() function.
Requirements
Create a menu structure – the user must be able to select the desired operation.
Create Pop() and Peek() functions
Pop() removes the topmost element from the stack and returns the value.
Peek() returns the topmost element from the stack, but does not remove it.
The functions you create are not allowed to display the values themselves! Print the values on the screen once they are returned back to the menu!
NB! In the base variant, you are allowed to reserve a special value to be returned in the case of stack underflow, so that the illegal value wouldn’t be displayed. This is not allowed in the advanced version of this task.
Depending on the function, implement either underflow or overflow detection!
Pay attention to the situation when all the elements are removed from the stack – do not use realloc for 0 bytes, use the free() function!
Debugging function for the stack
In order to get a bit better understanding of the stack contents, we’ll provide a debugging function. In reality, this kind of function goes against the limitations of the stack and may also not even be possible due to the implementation specifics!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
voiddebug_stack(stack st)
{
printf("DEBUG\n");
printf("\tStack size: %d\n",st.size);
printf("\tData pointer: %p\n",st.data);
if(st.size!=0&&st.data==NULL)
{
printf("Sanity check failed!\n");
printf("Stack is NULL, but size is not 0\n");
return;
}
if(st.size==0&&st.data!=NULL)
{
printf("Sanity check failed!\n");
printf("Stack data is not NULL, but size is 0\n");
return;
}
printf("\tValues: ");
for(inti=st.size-1;i>=0;i--)
{
printf("%d ",st.data[i]);
}
printf("\n\n");
}
Testing
During testing, pay extra attention to
stack overflow
stack underflow
freeing the data after last element is popped.
memory leak when the program exits when there are still values in the stack
This advanced task is given to you as a description of an algorithm. This is a completely independent task, that is not related to the original lab task solution!
NB! The variables in the description are given more to the likes of mathematicians (single letters). Use descriptive variable names when creating your program as is more suited for programming.
Given search key
K and an array
A of
n integers A0, A1, … , An – 1 sorted such that A0 ≤ A1≤ … ≤ An – 1 use the following algorithm to find if
K ∈ A
Set lowIndexL to
and highIndex H to
n − 1 .
Call the recursive function using 4 arguments –
A, L, H, K .
If
L > H , the search terminates as unsuccessful – return 0
Set
m (the position of the middle element) to the
floor of (L + H) / 2
If Am < K, set
L to
m + 1 , call recursion again
If Am > K, set
H to
m − 1 , call recursion again
If Am = K, the search was successful – return 1
Once the task is complete, prepare the answer to the following questions. Once ready, present your task!
Time complexity – compare the given algorithm and conventional linear search. How many comparisons do you need to find the result from an array that has
This lab has one task that is extended by 2 advanced tasks
Lab task: Reading a file dynamically
The purpose of this task is to introduce how to read an unknown length file and allocate the structure array for it dynamically in the exact length required.
Data file
Data file structure for the task:
<index> <last name> <first name> <curriculum code> <points>
Index (integer)
First and last name – strings with varying length
Curriculum code – string with exactly 4 characters
Points – floating point value from 10.0 to 30.0 with a precision of 0.1.
You can use the data file generator from last week for this. To test, we’ve also provided files so you can compare your output.
Use realloc() to read the data, expanding your memory as you are reading the file
Store the data into a dynamically created structure array
Once the reading is finished, you allocated memory size must match the exactly to how many were read from the file
You can only read the file once (repeat reading is forbidden, this includes just checking for newline count!)
The length of the file can change – the exact length is determined during reading.
All variable length strings must be stored in memory exactly (using dynamic memory allocation).
During reading, use a static buffer and then use dynamic memory to store it into the struct.
The file contains students competing for a scholarship. Scholarships are given
For 7 best students (highest points count) for the following curriculums: IACB, EARB, MVEB
Print the list of students who will receive the scholarship and also print the number of students who got it from each curriculum.
Make sure there are no memory leaks!
Data structure for the task
For this task, we will transition into using dynamic memory for variable length strings (and other arrays) inside of our structure in order to save memory. The struct will now take the following form:
1
2
3
4
5
6
7
8
typedefstruct
{
intindex;
char*fName;
char*lName;
charcurriculum[LEN_CURRICULUM+1];
floatpoints;
}person;
Note:
You can alter the naming to your liking.
The fName and lName are now pointers that need dynamic memory allocation due to the length being variable from person to person.
Array for curriculum stayed as static because it never changes – using dynamic memory here would be a waste.
Single ints, floats… remain static because again, using dynamic for those members would be making the program purposefully slower and more complex.
Recommended list of functions
NB! The actual form the functions take depends on how you tackle the task.
At minimum, you need three functions:
Reading the data (check the next subheading).
Printing the results.
1
2
voidPrintScholarships(person*pStudents,intn);// Good for base task
voidPrintScholarships(student_wrapper stdWrapper);// Good for advanced task 2
Freeing the memory.
1
2
3
voidFreeStudentData(person*pStudents,intn);// Good for base task
voidFreeStudentData(person**ppStudents,intn);// Good for base task with defensive programming
voidFreeStudentData(student_wrapper stdWrapper);// Good for advanced task 2
Recommended functions to make your life easier
qsordi compare function
1
intComparPersonByPoints(constvoid*a,constvoid*b);
Printing the data of a single student (useful for printing the students who get the scholarship).
1
voidPrintStudent(student*s);
For defensive programming, the function shown on the slides
1
voidFreeMemory(void**p);
To make sure everything got into the memory fine, you might want to have a function that just prints everyone data. This is however not a part of the task.
Updates to reading a file function
From now, the reading function takes a new form. We need to get 2 values from the reading function – location of the array and number of lines. I’m proposing a three options for this different function.
First variant will return the number of lines and passes the location of the memory as a double pointer.
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
/**
* Description: Reads data from a file. During reading, a dynamic
* struct array will be created and expanded using realloc()
* Parameters: ppStudentData - Stores the location of the allocated array
* fileName - name of the input file to read
* Return: Number of lines read from the file
*/
intReadData(person**ppStudentData,char*fileName)
{
// Current line counter
intcount=0;
// Main pointer for the allocated array
person*pData=NULL;
// Temporary pointer used for reallocating the array
person*pTemp=NULL;
// TODO: Temporary buffers for reading (all variables you intend to read!)
// Read a record at a time in a loop into buffer(s)
while()
{
// rellocate memory to fit the latest line
// Check allocation was successful
// Make both pointers point at the memory location
// Allocate memory for the struct members fName and lName, check allocation!
// Copy data in from the buffers into the struct array
// Increment number of records successfully read
count++;
}
// Store the allocated array trough the double pointer
*ppStudentData=pData;
// Return the number of lines read
returncount;
}
Second variant returns the pointer to the allocated array and passes the line number through a pointer.
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
/**
* Description: Reads data from a file. During this, a dynamic
* struct array will be created and expanded using realloc()
* Parameters: pLineCount - pointer to store the read line count
* fileName - name of the input file to read
* Return: Pointer to the allocated data array
*/
person*ReadData(int*pLineCount,char*fileName)
{
// Current line counter
intcount=0;
// Main pointer for the allocated array
person*pData=NULL;
// Temporary pointer used for reallocating the array
person*pTemp=NULL;
// TODO: Temporary buffers for reading (all variables you intend to read!)
// Read a record at a time in a loop into buffer(s)
while()
{
// rellocate memory to fit the latest line
// Check allocation was successful
// Make both pointers point at the memory location
// Allocate memory for the struct members fName and lName, check allocation!
// Copy data in from the buffers into the struct array
// Increment number of records successfully read
count++;
}
// Store the number of lines trough the pointer
*pLineCount=count;
// Return the pointer to the data
returnpData;
}
I’m also proposing a third option. The benefit in this one is that the function can convey information about the status of the reading to the caller (e.g. successful read, error reading). If you want, you can add more codes to be more precise on the errors.
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
/**
* Description: Read data from a file. During this, a dynamic
* struct array will be created and expanded using realloc()
* Parameters: ppStudentData - Stores the location of the allocated array
* pLineCount - pointer to store the read line count
==14397== All heap blocks were freed -- no leaks are possible
==14397==
==14397== For lists of detected and suppressed errors, rerun with: -s
==14397== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Advanced task 1: Optimal algorithms for allocation
In this task we’re changing the algorithm how we are allocating memory to a more reasonable one.
Requirements
Change the file reading in a way that you would be using the (n * 2) expansion strategy
Every time you run out of memory, you reallocate to a 2x higher amount that you have right now.
To start, pick an initial allocation that is greater than 1 (pick a size that would be 3x lower than the file length just so you can observe the correctness).
Advanced task 2: wrapper struct
In this task, we’re making our code a bit more readable and contained. We will put both the struct and its properties into a wrapper to keep everything tightly integrated.
Requirements
Put your structure array pointer into the other struct (wrapper)
In your wrapper, you should have 3 variables – pointer to the struct, number of structs allocated, number of structs used
Change your function parameters for the existing functions to use the new wrapper struct
NB! Think through for which functions it is required to pass a pointer to the wrapper where it is unnecessary.
Hint: Now you will need an extra access layer to get from the wrapper to the struct itself. To help with readability, you might want to “unpack” those variables when you need to use them. See the following example.
1
2
3
4
5
6
7
8
9
10
voidPrintData(student_wrapper stdWrap)
{
intlen=stdWrap.used;
for(inti=0;i<len;i++)
{
student*s=&stdWrap.studentDB[i];
}
}
After this class, you should
know different ways to structure your code in order to read files that can be of any length
Know how to dynamically alter your array size
Know how to read data into an expanding array using dynamic memory allocation
know all possible ways realloc works
know the difference between (n + 1) and (n * 2) allocation strategies and be able to implemetn at least one of the two
be able to allocate memory to members of structures
know about defensive programming idea for freeing memory
Know all the steps strdup() does in the background and be able to use it.
This week we have one lab task that is extended by two advanced tasks.
Task 1: random data file generator
This week, we will create a program that can generate random data files. These are useful for testing other applications without having access to real data.
Ask the user for how many generated entries they need. There must be no upper bound!
All entries are stored in a struct array during generation. Struct array itself is generated using dynamic memory allocation.
Pick random first and last name and curriculum code from the pools given to you.
Generate admission points
Admission points are in range from 10 to 30, ends inclusive. Precision is 0.1 points (e.g. 24.7)
Sort the structures based on the generated last name. If last names for two entries match, order them by their first name.
Write the output in the following format: <index> <last name> <first name> <curriculum code> <admission points>
Index is a unique integer. First entry will have the index as 0, every following one is incremented by one.
Make sure that all resources (including memory) is freed when the program exits, use valgrind to make sure.
Qsord comparison function
I’m proposing two different comparison function options. In the first case, the type cast will be done as needed, removing the need for additional variables.
In the second case, we will create temporary pointers that take up some memory, but improve the readability of the code.
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
intComparCandidate(constvoid*a,constvoid*b)
{
// Temporary cast pointers for readability
candidate*pA=(candidate*)a;
candidate*pB=(candidate*)b;
// Get comparison for last names
intret=strcmp(a->lName,b->lName);
// When last names match, return difference by first name
if(ret==0)
returnstrcmp(a->fName,b->fName);
// return difference by last name
returnret;
}
Workflow
Create the necessary structure declaration to keep the data
Ask the user how many entries to generate
Allocate the memory, check the allocation!
Generate the necessary entries
For every person, you must generate the entries randomly.
For every field from the pools, you must create a random number to pick out the member Hint: you can either copy the name to your struct or only keep a pointer to the name
In addition, you need to generate the admission points.
Hint: Think about this mathematically – e.g. what’s the difference between 30 and 300!?! rand() function will always give you an int, regardless if what you try to do to it.
Sort the structure array
Write the results to an output file
Free the allocated memory
Check with valgrind for correctness! Not only in the end, but also if you encounter some weirdness, crashes or corruption!
Testing
The output of your application should be relatively simple and short.
==12389== All heap blocks were freed -- no leaks are possible
==12389==
==12389== For lists of detected and suppressed errors, rerun with: -s
==12389== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
When looking out the output file, we see that everything is sorted by name (showing only first 15 lines)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
0 Aas Heidy IACB 27.9
1 Aas Juhan IACB 29.6
2 Aas Reet IACB 18.2
3 Aasa Anne EARB 10.1
4 Aasa Julia MVEB 13.8
5 Aasa Sven EARB 14.5
6 Aasa Taavi MVEB 16.1
7 Aasa Urve EARB 29.4
8 Aasa Valdo IACB 21.2
9 Allik Ivari IACB 21.3
10 Allik Kerttu MVEB 29.8
11 Allik Kerttu MVEB 21.4
12 Allik Laivi MVEB 24.2
13 Allik Peeter IACB 22.3
14 Allik Rainer IACB 17.5
Advanced task 1: output file formats
In this task, we’ll add CSV as a secondary output format for our application. User must be able to choose which output format they want.
Requirements
Add ability to your program to generate data in the CSV format
First line of the CSV must be the header with the field names.
This will be followed by data lines. Each member is separated by a comma. NB! In CSV, we don’t put a space after the comma!
Ask the user in which format they wish the file to be generated in (CSV or space-delimitted) and generate the appropriate data file.
For the space-delimitted file, the extension must be
.txt and for the CSV file, use the extension
.csv .
Make sure that the CSV is correctly generated – try to open (or import) it using Libreoffice Calc’i or Microsoft Office and see if they recognize the format correctly.
Advanced task 2: settings
In this task, we’ll make our generation a bit more flexible and add settings to it.
Requirements
All settings must be kept in a structure – create a new structure declaration for this!
All settings must have default values
Program must use the defaults if the user does not wish to change the settings. What the user needs to do to alter the settings is up to the developer E.g. the program could ask if they wanted to change them as the first thing it does or the user can use a specific command like argument, that allows settings to be edited.
User must be able to alter the following settings, while inside of the program.
Which data fields are generated (it must be possible to turn each one of them on or off).
Name of the output file (only the name part. Extension must be chosen automatically based on the output format!)
Output format (this is from advanced task 1, move the location of the setting into the struct).
Number of items generated (base task part, move the location of the setting to the struct)
Lowe rand upper bounds for the admission points
NB! The number of entries generated must be asked regardless if the user wished to change the settings or not. Purpose of this is just to keep all settings neatly in one struct.
User must be shown the settings the generation will be performed under regardless if they chose to edit it or not.
Note: if you wish, you can make a settings file, however it is not necessary. The updates to settings do not need to be remembered between executions. However if you do include a settings file, make sure that the user can change those settings within the program without editing the file manually.
Hint: You can make nice use of the trinary operator here printf("First name: %10s\n",settings.genFirstName?"Yes":"No");
After this class, you should
Know how to use dynamic memory allocation
Know how to check for memory leaks.
Know in which situations it is reasonable to use dynamic memory and in which situations you should not.
Know the difference between function call stack and heap.
In this lab you have one task, which is divided into two parts. The task is extended by two advanced tasks.
NB! Even though you can develop on any platform you want, one part of the proof will be to execute your program through Valgrind. Valgrind is known to work well on Linux and is known to be available on MacOS, however not on Windows. If you wish to develop on Windows, you need to transfer the file to a Linux environment when defending.
The archive contains the files for both the base and advanced task!
Task part 1 / 2: Reading data and processing
In the first part of the task, we need to read the data, process and print it. You will also need to demonstrate your use of compiling using a Makefile and testing memory errors using valgrind.
Requirements
Read data from the input file
Data file structure <õppeaine nimi> <hinnete arv> <hinded>
Data must be stored in a structure array. For grades, you can create a reasonably sized array in the struct – e.g. 20 grades (based on the data file for the task)
Print the following information
Name of the subject
Grades given in the subject
Average grade for the subject
Keep your reading, calculating average and output in separate functions (i.e. all called from main).
Code must be divided into two code files, which both have their own header files. Division must make sense, but details are up to you.
Code must be compiled using a Makefile
At minimum, Makefile must describe the recipe
all , have a variable
CFLAGS and use the following flags for compilation:
-Wall -Wextra -Wconversion -g
Other aspects of the complexity and structure of the Makefile are up to you!
During the defence
Show compilation using a Makefile
Show the output through valgrind to prove that there are no issues.
Hints
The number of grades for each subject can be different, which means that every line in the file might have a different length. Due to this, we cannot read the data file with just a single
fscanf() function call like shown on previous weeks.
Due to this, we need to perform the reading of the data using nested loops
The outer loop will read the name of the subject and the number of grades in that subject
The inner loop will use the number of grades obtained from the outer loop to specify the number of iterations. Do not forget to check the return value of
fscanf() of the inner loop
Testing
This is an example of the code structure that you could have.
NB! Names of files are not determined, it’s just a sample. Also the data file can be in the same folder as the code files.
1
2
3
4
5
6
7
8
9
.
├── data
│ ├── data_grades.csv
│ └── data_grades.txt
├── analyzer.c
├── analyzer.h
├── Makefile
├── subjects_processor.c
├── subjects_processor.h
NB! The following output exampel doesn’t show valgrind output. Make sure to test using valgrind as well!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Subject: Programming
Grades: 4 5 5 4 2 5 1 4 2 5
Average: 3.70
Subject: Databases
Grades: 5 3 3 4 4 3
Average: 3.67
Subject: Mechatronics
Grades: 3 3 4 4 5 0
Average: 3.17
Subject: Physics
Grades: 5 5 3 3 2 3 4 5 1 1 2 4
Average: 3.17
Subject: Ethics
Grades: 5 5 5 4 5 3
Average: 4.50
Subject: Scientology
Grades: N/A
Average: N/A
Subject: Chemistry
Grades: 4 4 3 4 5 4 5
Average: 4.14
Task part 2 / 2: Logging
In this part of the task, you need to add logging to your program.
Requirements
Add logging for the program
One log per line. Logs cannot be on two lines!
Every log starts with a timestamp. Timestamp must contain the date (day, month, year) and the time (hours, minutes, seconds)
When executing the program again, the logs must not be erased from the previous executions.
Events to log
Opening and closing of the input file
Reading of the input file completed. Print the number of lines read.
Reading of the input file aborted due to reaching the limit of the array. Print the max size in the log. NB! You have two arrays – struct array and grades array in each struct!
Error calculating average grade.
Integrity of the log must be ensured even if the program crashes
Option 1: You can force writing of the output buffer to the file using
fflush()
Option 2:Close the log file after each write – closing of the file writes the buffer to the file.
If the log file fails to open, program must continue its work (and not exit!). Notify the user of failure.
Hints
Prepare the string before calling the logger function! Do not overwhelm the logging function with parameters, you want it to be simple to call. To prepare the string, a good option would be to use
snprintf()
To simplify the logger function, it is recommended to to pass it the name of the file nor the pointer to the log file. Some possible options:
Open and close the file in the log function
This is also one place where are global variable would not be out of place. If you wish, you can create the file pointer as a global variable (not a good solution, but an acceptable one). However an even better solution would be to write the advanced task.
I.e. calling the logger function might something like this: Logger("Program started");
Advanced task 1: logging library
In this task, you will create a logging library. Hint: you can use it for homework 2.
Requirements
Add a set of .c and .g files and move your logging code there
Add 4 levels for logging: OFF, ERROR, WARNING ja INFO – these are enums!
On each line of the log file, print the logging level of the message
Log should be only written to the file if the set logging level permits it.
I.e.
INFO level means that all messages will be logged;
WARNING level means that only messages with the logging level WARNING and ERROR are logged;
ERROR level means that only messages with the logging level ERROR are logged
OFF means logs are not produced
To call logger, you now needs to include the level of the log (ERROR, WARNING või INFO)
E.g.
Logger(LOG_INFO, "Program started");
Your library must support giving a specific name to the log file
E.g.
LoggerSetOutputName("log.txt");
Your library must support setting the logging level
E.g.
LoggerSetLoggingLevel(LOG_INFO);
The settings for the log must be kept internally in the logging library. Settings include the log file name and logging level (you can add additional ones if you want). Keep these as global variables in your logging library.
Settings for the logger should have default values – e.g. if developer forgets or does not want to set them up, then defaults would be used and program wouldn’t crash.
All logs must be timestamped as described in the base task.
Advanced task 2: Reading simplified CSV
In this task our purpose is to show a simplified way to read names that consist of multiple words.
Requirements
Use the simple CSV version of the input file
Data fields are separated from each other by a comma, data fields do not contain commas.
Walkthrough
NB! We are using a simplified version of CSV files and reading then. You cannot read any CSV file in this manner! If a fields would contain a comma, this method of reading could not be used!
If you need complete CSV support, either use a library (e.g. libcsv) or write your own that could be based on a finite state machine (FSM) – you read a character and then depending on the current state and that character, you choose what to do.
scanf() family of function supports describing simple regular expressions for reading and describing data.
E.g. to read a time, we can use the format
scanf("%d:%d",&hours,&minutes); – with this format, the user can enter the time as
14:35 . Variable
hours would get
14 as a value and
minutes would get
35 . However, if the user chose to enter
14 35 , it would not match the format –
hours would still get
14 , because we didn’t expect a space in the format, the reading stops and
minutes will be left uninitialized.
Based on this knowledge, we can compose a format
fscanf(fp, "%[^,],%d", ...) . Replace the file pinter with your own and instead of three dots, put your variables where to store the data.
First field will be read as a string, until the comma (comma is not read).
Then the comma will be read out of the buffer (and thrown away)
Then, a single integer will be read (number of grades in the subject).
This method allows you to read the first two data fields. The rest of the format for getting the grades you must make on your own (don’t forget the commas!).
After this class, you should
be able to compose a simple Makefile
Know how to declare a variable and use one
Know how to write a recipe, including multiple recipes in a file
Know what goes into a recipe
Know about implicit rules for Makefiles
be able to compile code using Makefiles
Know to to use valgrind and read its output to test your programs for bugs
Know how to get the current time inside of a program and be able to format the time string
First task consist of two separately graded parts plus an additional advanced task.
Second task consist of a base task and an advanced task.
Task 1: Quicksort
This task is based on practicing using the
qsort() function to sort both basic data types and structs. During this, we will also start dividing our code into multiple code files and you can also start making your first header file. The starter code already has file structure given for you.
NB! The base code contains code for both parts of the base task and the advanced task as one!
Requirements
Implement the functions that are described in the header files required for the specific part of the task (look below for parts description). Implementation must be written in the .c file with the same name.
Call the required functions to sort and display the results from the switch statement in the main function
Compile the code from multiple files as was given by the structure in the starter code. Use command line to do so (or Makefile if you already know how).
Graded parts of the task
For part one of the base task, you must have menu options 1 and 2 completed. For this you must
Implement the qsort comparison function for arrays made out of integers and real numbers
Call out the qsort function with the comparison function made before as well as the print function to prove completion of the task.
For part two of the base task, you must have menu options 3 and 4 completed. For this you must
Implement the qsort comparison function for arrays made out of structures. One of them must compare the first name, the other one the employment length.
Implement the function to print out the structure array
Call out the qsort function with the comparison function made before as well as the print function to prove completion of the task.
For the advanced task, you must have menu option 5 completed. For this you must
Implement the qsort comparison functions to compare the structure array members based on their last name. If the last names are the same, you must use first name as an additional criteria for sorting the structures.
Call out the qsort function with the comparison function made before as well as the print function to prove completion of the task.
Testing
The following example includes all three graded parts (part 1 and 2 as base tasks and the advanced task).
In this task, we’ll introduce you how to include and compile programs using external libraries. We will use the libcurl library, which can be used to access various online APIs, download files etc. We’ll use this to download Estonian governments opendata. We’ll also proceed to practice dividing code into multiple files.
NB! To simplify your life, solve this task on a UNIX based operating system (e.g. Linux). It can be done on Windows, but the additional setup for this is more exhaustive and you need to change a few parts of the given source code.
Display the top 10 days with the highest confirmed cases count.
At minimum, your code needs to be separated into three code and header files
file_helper.c and
file_helper.h contain the data download and preparation (given in the starter code, no need to modify)
data_processor.c and
data_processor.h contain the reading and processing of the data
main.c and
main.h have controls for the program flow and generic macros.
If you want, you can separate out reading of the file and processing the data to additional files.
Compile all the code files together either from the command line or by using a Makefile.
Recommended workflow
Get acquainted with the structure and the contents of the code files.
Comment in the preferred download format (either space or comma separated data).
Compile and see if the data download works. To compile, you need to add an additional flag
-lcurl
Add a description of the data structure and the necessary macros for sizes to the
main.h header file. Hint: You don’t need to store all of the data from the file in the structures. Pick only the data you need.
Write the reading function to read data from the file into memory into
data_process.c . Add the prototype to that function into
data_process.h header file and call the function in the main() function. NB! Did you notice the contents of the first line of the file?
Hint: To ignore a certain field in the file, you can use the format
%*s in the scanf function. Asterisk notes that the field matching this format will not be stored. In this case, any string until a space or a newline is encountered will be ignored. When you use an asterisk in the format, you also don’t need to provide a pointer where to store the data (scanf parameter). Compile and test if your data is successfully read into the structure array!
Create a function to print the data. Call it to print the last 14 days of data. Hint: Use your knowledge of arrays and poiinter-arithmetic that we used in the first week (task 1, part 2)! This way you will get a simple printing function that you can reuse in the next step.
Create a comparison function for
qsort to sort by the confirmed case count and and call
qsort to sort the data. Display the top 10 days based on confirmed case count. Hint: If you used pointer-arithmetic to call the printing function in the last step and left your print function to be as simple as possible, you can reuse it to print the top 10!
Testing
NB! The output contains both the base task and the advanced task!
Warning! The file opendata_covid19_tests_total.txt already exists!
Are you sure you wish to download again?
Enter [Y] or [N]
n
Last 14 days
Date Cases
1. 2023-02-07 48
2. 2023-02-08 55
3. 2023-02-09 54
4. 2023-02-10 52
5. 2023-02-11 21
6. 2023-02-12 28
7. 2023-02-13 76
8. 2023-02-14 37
9. 2023-02-15 47
10. 2023-02-16 67
11. 2023-02-17 48
12. 2023-02-18 24
13. 2023-02-19 23
14. 2023-02-20 53
Top 10 days
Date Cases
1. 2022-02-15 8467
2. 2022-02-03 8045
3. 2022-02-04 7849
4. 2022-02-01 7243
5. 2022-01-28 7171
6. 2022-02-05 7082
7. 2022-02-18 6996
8. 2022-02-02 6886
9. 2022-02-17 6751
10. 2022-02-08 6704
Date range: 2023-02-07 - 2023-02-13 total cases 334
Date range: 2023-02-14 - 2023-02-20 total cases 299
Case count decreased by 11.71%
Advanced task
Find the number of cases for the last 7-day period.
Find the number of cases for the 7-day period preceding it.
Print out the date ranges and infection counts for both periods.
Find and print if the case count as increased, decreased or remained the same from one period to the next. Also print out the change as a percentage, showing two places after the comma.
Hint: The array was sorted during the base task and cannot be used for the advanced task in a good way without resorting (which we should always avoid if possible). We can however prepare a partial copy of the original data, before we run qsort to sort it. To make a copy, check out the function
memcpy() – look into the parameters it requires to set the starting point of the copy, you can use pointer arithmetic to make a copy of only the relevant data.
MS Windows
This task is also solvable in MS Windows, however it is a bit more tedious, requiring a few additional steps. These steps have been tested in the spring of 2022! This assumes you installed the 64-bit version of the MinGW (e.g. through chocolatey – as given in the simple installation guide for MS Windows, using the command line). The steps are given as a transcript of the conversion from last year and are provided as-is!
1. Install libcurl on Windows. Uses chocolatey package manager
choco install curl
2. Make the library (dll file) available to the compiled program. Copy libcurl-x64.dll from libcurl installation directory to your program directory (assumes 64-bit compiler).
3. You need to download CA certificate so cURL can verify the certificate on the opendata website https://curl.se/ca/cacert.pem.
4. You need to specify the certificate authority file location to the code. Add this to the setup of the download in the file_helper.c, replacing the path\\to\\file with the path to the downloaded .pem file.
5. You need to rewrite the check if a file is already downloaded because that uses POSIX library available on UNIX systems – unistd.h. You should rewrite it with windows solution of that. Haven’t tested, but you’ll likely find the function BOOL FileExists(LPCTSTR szPath) in the library io.h. You may need to add additional Windows-specific libraries.
6. You can only download as CSV with the code provided. To download space delimited, you need to rewrite the ReplaceChars function. Easy option would be to open up a second file pointer for writing to a different file and remove the fseek, Simplest would be to do getchar -> putchar with a check in between for commas to be replaced with spaces.
7. You need to add -I and -L flags to your compiler command line arguments. Mine were:
Part 2 continues on part 1 with all the existing requirements staying in place and continuing through this part of the task.
Requirements
Find and print the taxation values for every person.
Print both the sum as a value and as a percentage to the annual income.
Write a warning if the tax percentage is 10% or greater than their annual income.
Write an error if there is no income
Remember! Make functions that only do one thing. Do not put calculations in the same function as reading the data or printing. Have a separate function that only does the tax calculations!
Hint: You should add additional member(s) to your structures to be able to store and display the required results! Structure members don’t need match exactly to the data fields in the file, adding or even processing and storing it differently in memory is OK.
Recommended functions
NB! The list here is just a recommendation. Your functions and their parameters can be different.