This is an in-class exercise. An exercise page like this one will contain a brief description but is intended to be supplemented by discussion during our meeting time. Complete the exercise to the best of your ability in the time given. Feel free to talk with other students as you work, and do not be afraid to ask questions if you get stuck. Aim to complete as much as possible during our meeting, and submit on Gradescope to check your solution. You are encouraged to work at home to complete what you do not get through today, and ask questions over Piazza or in office hours.
Reinforces concepts learned in today meeting:
- C++ basics: I/O using iostream library, how to compile/link/execute
- Using the C++ Standard Template Library (std::vector, std::map, std::sort) and iterators
- Complete the implementation of code which counts the number of occurrences of particular words in a file, and then reports the results in different orderings by utilizing
std::map
. - Compare your implementation number sorting from ex23 to the implementation provided by the Standard Template Library.
Part 1
Get started by running git pull
to update your clone of the public repository, and then copying the exercises/ex24
directory into your personal git repository. You’ll see that we have provided map-simple.cpp
, input.txt
, main.cpp
, sort.cpp
, and monkeys.txt
.
Part 2
The supplied file map-simple.cpp
is a completed program that takes in input pairs from standard input representing phone contacts made of a name string and a phone number string, and inserts each into a std::map
structure. The name is the key in the map, and the phone number for that person is the value associated with that key in the map. This program will serve as a reference for you to follow as you work with std::map
to complete the remainder of the exercise.
First, compile and link map-simple.cpp
by typing:
g++ -std=c++11 -Wall -Wextra -pedantic map-simple.cpp -o map-simple
Next, run the executable map-simple
using the file input.txt
in place of interactive input using the pipe command-line operation by typing:
cat input.txt | ./map-simple
Now, look inside input.txt
and then the code in map-simple.cpp
and try to to understand how it works. Here are a few items we want to highlight:
- Notice that four (name,phone) pairs are present in the input file, but only three remain in the map when it is output. This illustrates an important point about maps: only one occurrence of a key value is allowed (that is, the original value for the key Sara is replaced by the second value supplied later in the input file).
- We make use of a
const_iterator
namedit
to traverse the map entries so we can output them. Each entry is a pair, and it runs over each of them in succession, so when we accessit->first
, we get the current entry’s key (i.e. name), and when we accessit->second
, we get the current entry’s value (i.e. phone number). - Note also that the values in the map are output in ascending order by key (i.e. name) value, even though they were not inserted in that order. We did not need to explicitly sort them in our code, either.
A map will always organize its entries in ascending order by key.
Before moving on to the next portion, make sure you follow what is happening in this code. Rushing to the next part without understanding this code will not help you finish this exercise any more quickly! Please ask questions if something does not make sense.
Part 3
Now, you will work to edit main.cpp
to work with maps in a more sophisticated way.
In this part, you’ll add code to read in words from standard input until end of file occurs, and populate the map<string, int>
named counters
so that each entry has a key which is a word collected and the corresponding value is the number of times that word appears in the file. The code to output the counters map contents is already present in this file.
Once you’ve modified it, you’ll compile main.cpp
using:
g++ -std=c++11 -Wall -Wextra -pedantic main.cpp -o main
Next, run the executable main
using the file monkeys.txt
in place of interactive input by typing:
cat monkeys.txt | ./main
Note that the contents of monkeys.txt
are as shown below:
and Mama called the doctor and the doctor said no more monkeys
So, if your code works correctly, you should see the following output:
word Mama has 1 occurrences
word and has 2 occurrences
word called has 1 occurrences
word doctor has 2 occurrences
word monkeys has 1 occurrences
word more has 1 occurrences
word no has 1 occurrences
word said has 1 occurrences
word the has 2 occurrences
Part 4
Once you have the counters map properly populated, and you confirmed that your output is correct, you can now work on rearranging that data you collected during Part 3.
In this part, you need to write code that will go through the simple counters
map you populated and copy over the data into a new map called words_by_freq
, but rearrange your data so that each entry in the new map has an integer key, and has an entire vector of strings as its value. The vector will hold all strings with the frequency indicated by the key integer. Recall that the push_back
method in std::vector
allows you to add an item at the end of a vector.
Once you have done this properly, if run with monkeys.txt
as input, your words_by_freq
map will contain entries like this:
key (an int) | value (a vector of strings) |
---|---|
1 | Mama, called, monkeys, more, no, said |
2 | and, doctor, the |
You can think of the words_by_freq
map as a notebook of information, where each entry is a page, and at the top of the page, you wrote an integer frequency, and on the rest of the page, you’ve written a list of all the words you encountered with that frequency.
To confirm that your code is correct, move onto Part 5 where you will write the code to output its contents.
Part 5
Using const_iterators
(not just one but two!), write code to output the contents of the words_by_freq
map you populated. This means that for each entry in the map, you will need to output the frequency key, and then iterate separately over the strings in the vector. (Happily, iterators can be used over vectors as well as maps!)
So, if the input words are the contents of monkeys.txt
, then the output would from this part should be:
Frequency: 1
Mama
called
monkeys
more
no
said
Frequency: 2
and
doctor
the
If some frequency does not exist in the table, then your code should not output it. For example, if your input was the following words:
the tall tall tall giraffe
Then the output from this part would be:
Frequency: 1
giraffe
the
Frequency: 3
tall
The words for each frequency shown in the sample output above are shown in alphabetical (dictionary) ordering. Recall that unlike std::map
a std::vector
is not automatically stored in order based on <
, so think about why this is.
Hint: we did not use std:sort
or even go out of our way at all to make this happen!
Part 6
Copy your implementation of sort
from ex23 into sort.cpp
.
Part 7
Invoke STL’s sort implementation to sort the contents in the vector vec2
.
Be sure to include any necessary headers.
Part 8
Compare the performance of your sort
implementation to the one provided by STL.
To do this you will need to compile
g++ -std=c++11 -Wall -Wextra -pedantic sort.cpp -o sort
and then run
./sort
your code.
To get a reasonable sense of the relative performance, it makes sense to use a large random vector (e.g. with millions of entries).
Remember to add and commit to your local repo copy as your work. Push to your remote repo when finished. Also scp and submit to Gradescope to check your solution. Use exit
to logout from your ugrad account when finished. If you continue to work on the program after class, make sure to keep your repo updated as well!