The CVT is composed of over one million words. The
following are three word lists that summarize the CVT. The first list
includes all of the words in the entire CVT. The second list is
composed of all the words in the children’s literature corpus. The
third list includes all the words in the newspaper corpus. **Please note
that certain tones and vowels have been formatted to be read by the
concordance program during the analysis process. For a complete list of
the formatting changes, see the Font Coding System.** Words are
listed in order from most to least frequent. The number of occurrences
and the percentage of occurrence in the entire CVT are included for each word.
Although every effort has been made to make this information accessible
to the reader, these word lists are rather extensive. It is advisable to
print only the portions or pages that interest you. It is permissible to
print and use the CVT for non-profit research and educational purposes,
provided that the appropriate citation to this website is given.
Please use the tabs to navigate to the Vietnamese Children's
Literature Corpus and the Vietnamese Newspaper Corpus.
Citation:
Tang, G. (2006). Corpora of Vietnamese Texts. Retrieved from www.vnspeechtherapy.com