# | Problem | Pass Rate (passed user / total user) |
---|---|---|
14333 | Basic - Ingredient Organizer | |
14334 | Advanced - TFIDF | |
14335 | Medium - Wasai Language Converter | |
Description
This is a basic question.
Imagine you are a chef preparing for a royal banquet. You have a variety of ingredients needed for the dishes. However, you want to organize these ingredients in a specific way. Write a program to help you with this task.
Input
The input is a single line of ingredients separated by commas.
Output
Count the frequency of each ingredient, then sort the ingredients by frequency in ascending order. Finally, output the ingredients whose frequencies are odd, along with their respective frequencies.
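The counting, sorting, and filtering steps can be sketched in Python as below. This is a minimal sketch, not a reference solution: the function name odd_frequency_ingredients is hypothetical, trimming whitespace around each ingredient is an assumption, and because the statement does not say how ties are broken, the sketch keeps tied ingredients in order of first appearance. The exact print format expected by the judge is not stated here, so the function returns pairs instead of printing.

```python
from collections import Counter


def odd_frequency_ingredients(line):
    """Return (ingredient, frequency) pairs with odd frequency, ascending by frequency."""
    # Split on commas; stripping whitespace and skipping empty parts is an assumption.
    items = [part.strip() for part in line.split(",") if part.strip()]
    # Counter remembers keys in first-appearance order.
    counts = Counter(items)
    # sorted() is stable, so ingredients with equal frequency keep that order
    # (tie-breaking is an assumption; the statement does not specify it).
    ordered = sorted(counts.items(), key=lambda pair: pair[1])
    # Keep only the ingredients whose frequency is odd.
    return [(name, freq) for name, freq in ordered if freq % 2 == 1]


if __name__ == "__main__":
    for name, freq in odd_frequency_ingredients("egg,milk,egg,flour,egg,milk,salt"):
        print(name, freq)
```

Sorting before filtering mirrors the order of the steps in the statement; filtering first would give the same result.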
Description
This is an advanced question.
TF-IDF (Term Frequency-Inverse Document Frequency) is a statistical measure used to evaluate the importance of a word in a document relative to a collection of documents (corpus). It combines two metrics: term frequency (TF), which counts how often a word appears in a document, and inverse document frequency (IDF), which measures the rarity of the word across the corpus. A higher TF-IDF score indicates that the word is significant in the document but rare in the corpus. This technique helps identify the most relevant words in a document, enhancing text analysis and information retrieval tasks. TF-IDF is widely used in natural language processing and search engine algorithms.
In this problem, we will calculate TF-IDF using a simplified method.
Input
Multiple lines, each of which is one document (sentence). The number of lines is not given in advance.
- TF is defined as the number of times a word appears in the document divided by the total number of words in the document.
- IDF is defined as the total number of documents divided by the number of documents containing the word.
- Find the word that occurs most frequently across all documents, then print its TF-IDF for each document.
- Ignore case when counting words. str.lower() returns a lowercase copy of the string str.
Example:
Input:
I love cats
You like orange cats and black cats
They don't like animals
'cats' is the most frequently occurring word.
'cats' TF on first document = TF('cats', 1)
The number of occurrences of 'cats' in document 1 is 1
The total number of words in document 1 is 3
TF('cats', 1) = 1/3 = 0.3333333333333333
TF('cats', 2) = 2/7 = 0.2857142857142857
TF('cats', 3) = 0/4 = 0.0
The total number of documents is 3.
The number of documents containing 'cats' is 2.
IDF('cats') = 3/2 = 1.5
The TFIDF of 'cats' on document 1 = TFIDF('cats', 1)
TFIDF('cats', 1) = 0.3333333333333333 * 1.5 = 0.5
TFIDF('cats', 2) = 0.2857142857142857 * 1.5 = 0.42857142857142855 => 0.43
TFIDF('cats', 3) = 0.0 * 1.5 = 0.0
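The worked example above can be reproduced with a short Python sketch. It assumes that words are separated by whitespace and that punctuation stays attached to its word (so "don't" is one word); neither is stated explicitly in the problem. The function name most_frequent_word_tfidf is hypothetical, and if several words tie for most frequent, this sketch simply takes the one seen first, which is also an assumption.

```python
from collections import Counter


def most_frequent_word_tfidf(documents):
    """Return the most frequent word and its rounded TF-IDF for each document."""
    # Lowercase every document and split on whitespace (tokenization is an assumption).
    tokenized = [doc.lower().split() for doc in documents]
    # Count each word over all documents to find the most frequent one.
    totals = Counter(word for words in tokenized for word in words)
    target = totals.most_common(1)[0][0]
    # Simplified IDF from the statement: total documents / documents containing the word.
    containing = sum(1 for words in tokenized if target in words)
    idf = len(tokenized) / containing
    scores = []
    for words in tokenized:
        # TF: occurrences of the word / total words; guard against an empty document.
        tf = words.count(target) / len(words) if words else 0.0
        scores.append(round(tf * idf, 2))
    return target, scores


if __name__ == "__main__":
    docs = [
        "I love cats",
        "You like orange cats and black cats",
        "They don't like animals",
    ]
    word, scores = most_frequent_word_tfidf(docs)
    print(word, scores)
```

To read an unknown number of lines, one option is to iterate over sys.stdin until it is exhausted and pass the collected lines to the function.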
Output
For each document, print the TF-IDF of the most frequently occurring word, rounded to two decimal places.
We recommend using Python rather than a calculator to compute the values, since floating-point arithmetic can produce slightly different results.
For example, to calculate 0.1+0.2, use print(0.1+0.2).
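The floating-point behaviour mentioned here can be seen in a few lines of Python. This is only an illustration; whether the judge expects the round() form or a fixed two-digit format is not stated in the problem, so both are shown.

```python
total = 0.1 + 0.2

# 0.1 and 0.2 have no exact binary representation, so the sum is not exactly 0.3.
print(total)
is_exact = (total == 0.3)
print(is_exact)

# round() returns a float rounded to two decimal places.
rounded = round(total, 2)
print(rounded)

# An f-string format always shows two digits after the decimal point.
formatted = f"{total:.2f}"
print(formatted)
```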
Description
This is a medium question.
Fin and Jack continued their adventure in the mystical land of Wasai. After crossing the mysterious bridge, they arrived at the other side of Wasai, discovering lush green forests and crystal-clear lakes. Deep within the forest, they encountered a peculiar creature, a colorful rabbit named Lina, adorned with spots of all colors, resembling a rainbow in the sky.
They wondered if there were human settlements nearby and attempted to communicate with Lina. However, they found themselves unable to understand each other. Just as they felt puzzled, they stumbled upon an ancient and mysterious object that explained how to convert their language into one that Lina could understand through manipulation:
Given a line of text, you are required to parse it and perform the following operations:
- Count the frequency of each character in the text (including spaces).
- If multiple characters have the same frequency, order them according to their first occurrence in the text.
- Encode the characters in descending order of frequency: replace the most frequent character with 'a', the second most frequent with 'b', and so on through the lowercase alphabet (a-z).
- Example input:
hello john how are you
- Example output:
cdeeabfacgbcahbijdbkal
- After counting and sorting the characters based on their frequency and order of occurrence, we have:
- 'o' and ' ' occur 4 times each
- 'h' occurs 3 times
- 'e' and 'l' occur 2 times each
- 'j', 'n', 'w', 'a', 'r', 'y', and 'u' occur 1 time each
- Encoding the characters:
- 'o' -> 'a'
- ' ' -> 'b'
- 'h' -> 'c'
- 'e' -> 'd'
- 'l' -> 'e'
- 'j' -> 'f'
- 'n' -> 'g'
- 'w' -> 'h'
- 'a' -> 'i'
- 'r' -> 'j'
- 'y' -> 'k'
- 'u' -> 'l'
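The counting, ordering, and encoding steps above can be sketched in Python as follows. The function name wasai_encode is hypothetical. The sketch relies on Counter keeping characters in first-occurrence order and on sorted() being stable, so characters with equal frequency stay in the order they first appear.

```python
from collections import Counter
import string


def wasai_encode(text):
    """Replace each character by a letter ranked by descending frequency."""
    # Counter keeps keys in the order each character first occurs in the text.
    counts = Counter(text)
    # Stable sort by descending frequency; ties keep first-occurrence order.
    ranked = sorted(counts, key=lambda ch: -counts[ch])
    # Pair the ranked characters with 'a', 'b', 'c', ...
    # This assumes at most 26 distinct characters in the input.
    mapping = dict(zip(ranked, string.ascii_lowercase))
    return "".join(mapping[ch] for ch in text)


if __name__ == "__main__":
    print(wasai_encode("hello john how are you"))
```

For the example input, this produces the example output cdeeabfacgbcahbijdbkal.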
Input
The input will consist of a line of text.
- 1 <= len(text) <= 10^4
- The text will only contain lowercase alphabetical characters and spaces.
- The number of distinct characters in each input will not exceed 26.
Output
The output should be the encoded version of the input text, with the most frequent character replaced by 'a', the second most frequent character replaced by 'b', and so on.