3017 - 2024GEC1506

#	Problem	Asker	Description	Reply	Replier	Reply Time	For all team
1	14334 - Advanced - TFIDF	GEC_1506_Jessie	Students who don't need to write the advanced problem.	108095035 109095029 110012023 110048218 110048235 110081007 110091016 110091025 111012062 111048123 111048135 111072263 111099067 111191025 112006267 112041007 112048227 112048235 112072144 112072219 112072244 112090006 112090024 112090045 112593236	GEC_1506_Jessie	2024/05/27 13:31:32

#	Problem	Pass Rate (passed user / total user)
14333	Basic - Ingredient Organizer
14334	Advanced - TFIDF
14335	Medium - Wasai Language Converter

14333 - Basic - Ingredient Organizer

Limits

	Time	Memory
Case 1	1 sec	32 MB
Case 2	1 sec	32 MB
Case 3	1 sec	31 MB
Case 4	1 sec	32 MB
Case 5	1 sec	32 MB

Description

This is a basic question.

Imagine you are a chef preparing for a royal banquet. You have a variety of ingredients needed for the dishes. However, you want to organize these ingredients in a specific way. Write a program to help you with this task.

Input

The input is a single line of ingredients separated by commas.

Output

Count the frequency of each ingredient, then sort the ingredients by frequency in ascending order. Finally, output only the ingredients whose frequency is odd, one per line, in the form "ingredient: frequency".
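
The counting, sorting, and filtering steps can be sketched in Python as below. This is a minimal sketch, not a reference solution: the function name organize_ingredients is hypothetical, and the handling of ties is an assumption, since the statement does not say how ingredients with equal frequency should be ordered (here they keep their first-seen order).

```python
from collections import Counter


def organize_ingredients(line):
    """Return 'name: count' rows for ingredients with an odd frequency."""
    # Split on commas and strip the whitespace around each name.
    names = [part.strip() for part in line.split(",") if part.strip()]
    # Counter remembers the order in which names were first seen.
    counts = Counter(names)
    # sorted() is stable, so equal counts keep first-seen order.
    # Assumption: the statement does not define tie-breaking.
    ordered = sorted(counts.items(), key=lambda item: item[1])
    return [f"{name}: {count}" for name, count in ordered if count % 2 == 1]


for row in organize_ingredients("tomato, garlic, tomato, tomato, cheese, garlic"):
    print(row)
```

In a submission, the literal string would be replaced by input().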

Sample Input

tomato, garlic, tomato, tomato, cheese, garlic

Sample Output

cheese: 1
tomato: 3

14334 - Advanced - TFIDF

Limits

	Time	Memory
Case 1	1 sec	32 MB
Case 2	1 sec	32 MB
Case 3	1 sec	32 MB
Case 4	1 sec	32 MB
Case 5	1 sec	32 MB

Description

This is an advanced question.

TF-IDF (Term Frequency-Inverse Document Frequency) is a statistical measure used to evaluate the importance of a word in a document relative to a collection of documents (corpus). It combines two metrics: term frequency (TF), which counts how often a word appears in a document, and inverse document frequency (IDF), which measures the rarity of the word across the corpus. A higher TF-IDF score indicates that the word is significant in the document but rare in the corpus. This technique helps identify the most relevant words in a document, enhancing text analysis and information retrieval tasks. TF-IDF is widely used in natural language processing and search engine algorithms.
In this problem, we will calculate TF-IDF using a simplified method.

Input

The input consists of multiple lines, each containing one document (sentence). The number of lines is not given in advance.
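
Because the number of lines is unknown, one option is to read until the end of input. The sketch below assumes a hypothetical helper read_documents that accepts any text stream. It is demonstrated on an in-memory stream so that it runs on its own. Skipping blank lines is an assumption, not something the statement requires.

```python
import io


def read_documents(stream):
    """Collect one document per non-empty line until the stream ends."""
    # Iterating over a text stream yields lines until end of input.
    # Assumption: blank lines are not documents and are skipped.
    return [line.strip() for line in stream if line.strip()]


sample = io.StringIO(
    "I love cats\n"
    "You like orange cats and black cats\n"
    "They don't like animals\n"
)
documents = read_documents(sample)
print(len(documents))
```

In a submission, passing sys.stdin (after import sys) instead of the in-memory stream reads the judge's input the same way.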

TF is defined as the number of times a word appears in a document divided by the total number of words in that document.
IDF is defined as the total number of documents divided by the number of documents containing the word.
TF-IDF is the product of TF and IDF. For the word that occurs most frequently across all documents, print its TF-IDF in each document.
Ignore case when counting words. In Python, str.lower() returns a lowercase copy of the string str.

Example:

Input:
I love cats
You like orange cats and black cats
They don't like animals

'cats' is the most frequently occurring word.
The TF of 'cats' in document 1 is written TF('cats', 1).
The number of occurrences of 'cats' in document 1 is 1.
The total number of words in document 1 is 3.
TF('cats', 1) = 1/3 = 0.3333333333333333
TF('cats', 2) = 2/7 = 0.2857142857142857
TF('cats', 3) = 0/4 = 0.0

The total number of documents is 3.
The number of documents containing 'cats' is 2.
IDF('cats') = 3/2 = 1.5

The TF-IDF of 'cats' in document 1 is written TFIDF('cats', 1).
TFIDF('cats', 1) = 0.3333333333333333 * 1.5 = 0.5
TFIDF('cats', 2) = 0.2857142857142857 * 1.5 = 0.42857142857142855 => 0.43
TFIDF('cats', 3) = 0.0 * 1.5 = 0.0
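
The worked example can be reproduced with a short Python sketch. The function name tfidf_of_top_word is hypothetical, and two points are assumptions because the statement does not cover them: words are split on whitespace only, and if several words tie for most frequent, the one seen first is used. Every document is assumed to contain at least one word.

```python
from collections import Counter


def tfidf_of_top_word(documents):
    """Return the most frequent word and its rounded TF-IDF per document."""
    # Lowercase each document and split it into words on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    # Count every word across all documents.
    totals = Counter(word for words in tokenized for word in words)
    # Assumption: on a tie, the word encountered first wins.
    top_word = totals.most_common(1)[0][0]
    # IDF = number of documents / number of documents containing the word.
    containing = sum(1 for words in tokenized if top_word in words)
    idf = len(tokenized) / containing
    # TF = occurrences in the document / words in the document.
    scores = [round(words.count(top_word) / len(words) * idf, 2)
              for words in tokenized]
    return top_word, scores


word, scores = tfidf_of_top_word([
    "I love cats",
    "You like orange cats and black cats",
    "They don't like animals",
])
print(word)
for score in scores:
    print(score)
```

For the three example documents this selects 'cats' and yields 0.5, 0.43 and 0.0, matching the hand calculation.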

Output

For each document, in input order, print the TF-IDF of the most frequently occurring word on its own line, rounded to the second decimal place.

The sample output shows results without trailing zeros (0.5, not 0.50).
We recommend computing the values with Python rather than a calculator, because floating-point arithmetic can produce slightly different results.
For example, to evaluate 0.1+0.2, run print(0.1+0.2).
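
A small illustration of the floating-point behaviour mentioned here, and of how round() differs from fixed-width formatting. The variable names are arbitrary. Using round() is a choice based on the sample output, which shows 0.5 rather than 0.50.

```python
total = 0.1 + 0.2
print(total)            # 0.30000000000000004, not exactly 0.3
print(round(total, 2))  # 0.3

value = round(0.5, 2)
print(value)            # 0.5, the form used in the sample output
print(f"{value:.2f}")   # 0.50, which would not match the sample output
```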

Sample Input

I love cats
You like orange cats and black cats
They don't like animals

Sample Output

0.5
0.43
0.0

14335 - Medium - Wasai Language Converter

Limits

	Time	Memory
Case 1	1 sec	32 MB
Case 2	1 sec	32 MB
Case 3	1 sec	32 MB
Case 4	1 sec	32 MB
Case 5	1 sec	32 MB

Description

This is a medium question.

Fin and Jack continued their adventure in the mystical land of Wasai. After crossing the mysterious bridge, they arrived at the other side of Wasai, discovering lush green forests and crystal-clear lakes. Deep within the forest, they encountered a peculiar creature, a colorful rabbit named Lina, adorned with spots of all colors, resembling a rainbow in the sky.

They wondered if there were human settlements nearby and attempted to communicate with Lina. However, they found themselves unable to understand each other. Just as they felt puzzled, they stumbled upon an ancient and mysterious object that explained how to convert their language into one that Lina could understand through manipulation:

Given a line of text, you are required to parse it and perform the following operations:

Count the frequency of each character in the text (including spaces).
Sort the characters by frequency in descending order. If multiple characters have the same frequency, order them according to their first occurrence in the text.
Replace each character with a lowercase letter (a-z) according to this order: the first character becomes 'a', the second becomes 'b', and so on.

Example input:

hello john how are you

Example output:

cdeeabfacgbcahbijdbkal

After counting and sorting the characters based on their frequency and order of occurrence, we have:

'o' and ' ' occur 4 times each

'h' occurs 3 times

'e' and 'l' occur 2 times each

'j', 'n', 'w', 'a', 'r', 'y', and 'u' occur 1 time each

Encoding the characters:

'o' -> 'a'

' ' -> 'b'

'h' -> 'c'

'e' -> 'd'

'l' -> 'e'

'j' -> 'f'

'n' -> 'g'

'w' -> 'h'

'a' -> 'i'

'r' -> 'j'

'y' -> 'k'

'u' -> 'l'
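
The mapping above can be built with a stable sort, as in the following Python sketch. The function name wasai_encode is hypothetical. The sketch relies on the guarantee that an input has at most 26 distinct characters.

```python
from collections import Counter
import string


def wasai_encode(text):
    """Replace each character with a letter ranked by its frequency."""
    # Counter keeps characters in order of first occurrence.
    counts = Counter(text)
    # Stable sort by descending frequency: ties keep first-occurrence order.
    ranked = sorted(counts, key=lambda ch: -counts[ch])
    # The most frequent character maps to 'a', the next to 'b', and so on.
    mapping = {ch: string.ascii_lowercase[i] for i, ch in enumerate(ranked)}
    return "".join(mapping[ch] for ch in text)


print(wasai_encode("hello john how are you"))  # cdeeabfacgbcahbijdbkal
```

In a submission, the text would come from input(). Spaces are significant, so the line should not be stripped of them.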

Input

The input will consist of a line of text.

1 <= len(text) <= 10^4
The text will only contain lowercase alphabetical characters and spaces.
The number of distinct characters in each input will not exceed 26.

Output

The output should be the encoded version of the input text, with the most frequent character replaced by 'a', the second most frequent character replaced by 'b', and so on.

Sample Input

hello john how are you

Sample Output

cdeeabfacgbcahbijdbkal