C++ Map (Homework assignment help!)

Shadowssong

[H]ard|Gawd
Joined
Sep 17, 2009
Messages
1,969
So my assignment is to take from standard input (from shell "$proj1.x < test0") a stream of letters and numbers. I am supposed to sort these and keep track of the numbers, characters and strings. My professor suggested we use a map (I've never used it before, been doing as much research as possible) as it would be the most effecient.

So from what I've learned this is how I am going to read it into the map, the problem is that I do not know how to increment the counter for the map pair (my understanding is that the char/num/string is to be unique so I would put it as the left value, and use the right value as the count for that specific char/num/string). This is just a snippet from my code. Can someone tell me how I would go about incrementing the right hand value? I am pretty confused on this assignment.

Code:
	//Declarations
	map<char, int> char_map;
	map<char,int>::iterator char_iter;
        char c;

Then in a while loop I peek for the next char, then onto this code:
Code:
if(isalpha(c))	//initial check for alpha
		{
			cin.get(c);	//grab the first char
			while(isalpha(c))
			{
				char_iter=char_map.find(c);		//Assign value from search
				if(iter != char_map.end())	//If the element exists
				{
					char_map[iter] += 1;	//Modify the count of the element HOW???
				}
				else
				{
					char_map[c] = 1;	//If it doesn't exist,create element with val 1
				}
				cin.get(c);
			}

The input file is:
AAA001aaA1AAa01aaa01bb

Any ideas? Am I implementing this wrong? I want to go talk to my professor but I don't think I will be able to go see him before it is due (classes and I have to go out of town). I am pretty much teaching myself the implementation of the map class, and my book is very little help (and I can't find any relevant examples online oddly enough)
Thanks so much !
-Greg
 
It's hard to guess what you really mean by "sort these and keep track of the numbers, characters and strings". Given the input file you name, what output are you supposed to generate? Without fully understanding the problem you're trying to solve, it's impossible to tell if you're going about solving it correctly or not.
 
Thanks for the quick reply, this is what the assignment says, sorry I didn't state it clearly, I'm very flustered with this.

The file proj1.cpp should contain the main function, int main(). In the main() function, the program should read the input until it reaches the end, counting the number of times each word, number, and character is used. A word is defined as a sequence of letters ('a'..'z' or 'A'..'Z'). Words are case insensitive ("AA", "Aa", "aA", and "aa" are the same). A number is defined as a sequence of digits ('0'..'9'). Note that both words and numbers can be of length of 1, that is, contain one letter or one digit, respectively. Different sequences represent different numbers. For example, number "001" is different from number "1". Words are separated by numbers or other non-letter and non-digit characters. Numbers are separated by words or other non-letter and non-digit characters. Your program should record the number of times each word, number, and character happens. The program should then output the ten most used characters, the ten most used numbers, and the ten most used words as well as the number of times these characters/numbers/words are used. Since words are case insensitive, the program only outputs lower case words. The characters, numbers and words should be outputted in the descending order based on the number of times they are used. When two characters happen in the same number of times, the character with a smaller ASCII value should be considered as being used more frequently. When two words (numbers) happen in the same number of times, the word (number) that occurs earlier in the input should be considered as being used more frequently.

I appreciate any help you can provide mike, you rock :D
 
And heres the output from running my professors executable:

@linprog3.cs.fsu.edu:~/mystuff/COP4530/temp> proj1.x < test0
Total 5 different characters, 5 most used characters:
No. 0: A 6
No. 1: a 6
No. 2: 0 4
No. 3: 1 4
No. 4: b 2

Total 2 different words, 2 most used words:
No. 0: aaa 4
No. 1: bb 1

Total 3 different numbers, 3 most used numbers:
No. 0: 01 2
No. 1: 001 1
No. 2: 1 1
 
You're on the right track, then; you've got the part done where you count each character an figure out how many times you've seen it.

You need to do more work to see if you're in a word or not, and once you end a word, tally that, too. You'll need more than just a single character to form and record a word, obviously, so you'll need some temporary storage and another map.
 
You're on the right track, then; you've got the part done where you count each character an figure out how many times you've seen it.

You need to do more work to see if you're in a word or not, and once you end a word, tally that, too. You'll need more than just a single character to form and record a word, obviously, so you'll need some temporary storage and another map.

Yeah I cut out some of the other stuff I have. These are ALL the declarations I have

Code:
	//Declarations
	map<char, int> char_map;
	map<string, int> word_map;
	map<string, int> number_map;
	map<char,int>::iterator char_iter;
	map<string,int>::iterator word_iter;
	map<string,int>::iterator num_iter;

	char c;
	bool word_flag = false;
	bool num_flag = false;

So I am thinking that I would use some kind of flag to signify when I am in a word (single letters are words as well).

I also have:
Code:
string word_string;	//create string to be placed in stringmap

Code:
word_string = word_string + tolower(c);	//concat the string with the char
Would this work? I get an error:
Code:
proj1.cpp:48: error: no match for ‘operator+’ in ‘word_string + tolower(((int)c))’
Does this mean that I can't use the string classes + operator when I use tolower?


What command would I use to increment the count on the map value? It seems that what I have is giving me an error:

Code:
proj1.cpp:42: error: no match for ‘operator[]’ in ‘char_map[char_iter]’


Thanks so much mike, I wish we had a rep system, I would give you 1000000000000 rep.
 
I just found this snippet of code on wiki:

Code:
int main()
{
    map<string, int> wordcounts;
    string s;
 
    while (cin >> s && s != "end")
        ++wordcounts[s];
 
    while (cin >> s && s != "end")
        cout << s << ' ' << wordcounts[s] << '\n';
}

Does this code mean that instead of searching for a value to see if it exists and incrementing it that I can just say like

char_map[c]++;

Instead of the whole

Code:
char_iter=char_map.find(c);		//Assign value from search
				if(char_iter != char_map.end())	//If the element exists
				{
					//++char_map[char_iter];	
				}
				else
				{
					char_map[c] = 1;	
				}

code I have?

EDIT:
New code (much smaller, can't believe I didn't see this on wiki):
Code:
		if(isalpha(c))	//initial check for alpha
		{
			cin.get(c);			//grab the first char
			string word_string = "";		//create string to be placed in stringmap
			//word_flag = true;
			while(isalpha(c))	
			{
				++char_map[c];		//increment counter or create new pair
				temp = tolower(c);
				word_string = word_string + temp;	//concat char to string
                                cin.get(c);
			}
			++word_map[word_string];	//increment counter or create new pair
		}
 
Last edited:
If you check the documentation for std::map::eek:perator[], you'll see that it inserts a new element if one isn't found. So, yes; you don't need to test for existence before incrementing.

You're making progress; keep at it. I think you'll next find that your loops aren't quite right and that you are missing some of the characters you're meant to be counting.
 
If you check the documentation for std::map::eek:perator[], you'll see that it inserts a new element if one isn't found. So, yes; you don't need to test for existence before incrementing.

You're making progress; keep at it. I think you'll next find that your loops aren't quite right and that you are missing some of the characters you're meant to be counting.

Yeah I am having a problem with actually stopping the main loop. Right now I have it terminate at b which is the last char in the input file.
Here is my full code (seems to be working alright, not complete though):

Code:
	while((c = cin.peek()) != 'b')
	{
		if(isalpha(c))	//initial check for alpha
		{
			cin.get(c);			//grab the first char
			string word_string = "";		//create string to be placed in stringmap
			while(isalpha(c))	
			{
				cout << "Reading in char: " << c << endl;
				++char_map[c];		//increment counter or create new pair
				temp = tolower(c);
				word_string = word_string + temp;	//concat char to string
				cin.get(c);
			}
			++word_map[word_string];	//increment counter or create new pair
		}
		if(isdigit(c))
		{
			while(isdigit(c))
			{
				cout << "Reading in int: " << c << endl;
				++num_map[c];
				cin.get(c);
			}
		}
	}

I also omitted the output code that I have at the bottom.

What would I use to tell the main loop to stop grabbing characters? I tried /n and EOF, not sure what else to try.
 
If you check the documentation, you'll see where get() does in response to the stream ending.
 
I have it reading everything correctly, my main problem is now going to be the output as when I output I am supposed to sort it by the value, instead of the key (which is the word/char/num). From what I have read it would require swapping into another map and making the key the value and the value the key, but that wouldn't make sense because the value would not be unique (therefore wouldn't be able to swap accurately). Would I need to create my own function that sorts the values and prints them out? I am confused on how to print this output in order :(
 
For printing it out, instead of using a map<int, char> you could use a map<int, vector<char> > so you could store more than one character per number. and then sort by that.
 
For printing it out, instead of using a map<int, char> you could use a map<int, vector<char> > so you could store more than one character per number. and then sort by that.

I don't understand what you mean, sorry. :( I don't have any map<int, chars>. My numbers are currently stored as strings (so I can store something like 001 or 01). The map auto sorts it by key value (which is the character/string/number_string stored) but I am supposed to sort it by the number of occurrences of that word (which would be the value), and if it is ever tied for occurrences then sort it by ASCII value or something along those lines.

I can't switch the map from map<string, int> to map<int, string> and then sort it that way because there are multiple occurrences of counts of words, and since you have to have unique values, it wouldn't work. I will ask my teacher or TA tomorrow. Thanks for the reply!
 
I have it reading everything correctly, my main problem is now going to be the output as when I output I am supposed to sort it by the value, instead of the key (which is the word/char/num).
This is a pretty common problem with data structures. A single data structure is going to get you accessibility or ordering along one value, and is convenient for processing. Reporting often ends up being a different data structure because it requires a different ordering, not natural to the processing data structure.

From what I have read it would require swapping into another map and making the key the value and the value the key, but that wouldn't make sense because the value would not be unique (therefore wouldn't be able to swap accurately). Would I need to create my own function that sorts the values and prints them out? I am confused on how to print this output in order :(

When you sort, you'll want to swap both the key you're sorting on and the rest of the data associated with that key. You're right that a std::map won't help because a std::map enforces uniqueness over the keys; you can only have one key with a given value. You'll have to use another data structure which allows multiple identical keys. A std::multimap is such a structure. You'll need to work out how to do the sorting. You can also copy your data to an array, or use an array of references back into the original map.
 
Mike, my professor said it wouldn't even be worth moving the whole map into another structure such as a multimap because we only have to search for the top 10 occurring items. I was thinking that if I could figure out how to find the top 10 elements that I could assign their key to a vector of that type and then sort according to that key's value. And then print out the vector and its corresponding value in the map. Would this be a logical approach?
 
There are algorithms which can help you get the top 10, but you'd end up wanting a different data structure, anyway.
 
My TA said that a quick fix would be to just go through each map and print out the highest value and then remove it and print the next highest, and repeat until 10 have been reached. It seems like when my professor said that the map would be the most efficient he didn't actually mean it would work in the sorting.
 
I got this working except for one thing, I am sorting it until I have two values who have the same count. Instead of sorting it by when it occurs in the file, I am sorting it alphabetically (which makes more sense to me but I guess that doesn't matter). It also works fairly efficiently on large inputs (over 2 million characters).
Any ideas on how I would sort it by size AND if it ties sort it by which one came first in the input? I have no idea because the map automatically sorts it as soon as I store it.
Thanks again for the help mike.
 
Ripping through the list 10 times causes 10*n comparisons. Doing a sort causes n*log(n) comparisons.

You can't create a single std::map that sorts on two different comparisons.
 
Ripping through the list 10 times causes 10*n comparisons. Doing a sort causes n*log(n) comparisons.

You can't create a single std::map that sorts on two different comparisons.

Yeah I realize that it is extremely un-efficient but without the proper time to rework it in order to sort it correctly I am stuck with this. Looks like I am stuck with what I have, guess I will just take the 5-10 point hit that will come with not sorting correctly the ones that have the same count. Thanks again for all your help!
 
What would be the correct sort for items with the same count? I'm not sure what the hang up is.
 
Oh I have to sort according to when it occurs in the input when they have the same count.
 
Back
Top