

















 
















A Chinese-English Database



Beginning Chinese


When you first look at Chinese, the characters look different but you can't seem to remember them.  It is like trying to remember all of the lines you made in a doodle.  Some people can remember the English meaning for a Chinese character but not the Pinyin, while some can remember the Pinyin but not the Chinese character -- my favorite is when I can remember the Pinyin for a character but not the English meaning.  In order to get all three parts to hang together you have to study from English to Character, English to Pinyin, Pinyin to Character, Character to Pinyin, etc.



The second problem I found when starting Chinese was that different dictionaries use different fonts (calligraphic styles, such as Kai, Song, and Wei).  When you have only studied one form, it is difficult to readily recognize different forms of the characters.  Looking only at kaishu characters limits your ability to pick out the distinctive features of the characters.



The distinctive features are those characteristics which are necessary to recognize a character and to discriminate it from other characters.  In the Roman alphabet, the line and half circle that both d and b have are common features; the fact that the line is on the right in one and on the left in the other is a distinctive feature.  It is also important that a certain amount of line appear above the half circle -- but the amount may vary.  When writing Chinese characters there is also a great amount of variation possible -- wait until you see Chinese handwriting!  However, as in the Roman alphabet, the variations have limitations.  The list of rules governing which lines can or cannot be shortened, moved slightly, or curved would be immense.  The only efficient way to pick it up is to begin reading different standard fonts and expose yourself to different handwriting styles.



Another difficulty faced by English speakers learning Chinese is the 'measure word'.  Measure words indicate units of items; some can be easily translated (bei1 = cup, glass) while others cannot be translated (zhang1 = unit of something flat like a desk or a piece of paper).  Just to make life interesting there are words which take multiple measure words -- those which are colloquial and those which are seldom found outside of books!  The introductory books I have read soft-pedal the measure words by teaching only a few of the most common in the first two years, but warn that if you don't use the correct one you will not be considered literate; or worse, not be understood.




I believe that the measure words should be learned with the noun.  We learn le and la with French nouns, and if you ask a native speaker of Chinese what the measure word is for a noun, they say the word in a phrase of the form "one (measure word) (noun)":  "yi4 ben3 shu1", "yi4 zhan3 tai2deng1" [one (mw) book; one (mw) table lamp].  The sound of the phrase cues the memory.
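The "one (measure word) (noun)" phrase can be generated from a noun-to-measure-word table.  The following is only a rough Python sketch:  the table holds just the two examples given here, the function names are made up for illustration, and the rule that "one" is read yi2 before a fourth tone (and yi4 before tones 1 to 3) is the usual classroom rule rather than something taken from the database.  Neutral-tone measure words are not handled.

```python
import re

# Illustrative noun -> measure word table (numbered pinyin), limited to the
# two examples in the text.  The real database stores this per entry.
MEASURE_WORDS = {"shu1": "ben3", "tai2deng1": "zhan3"}


def yi_before(syllable: str) -> str:
    """Return the reading of 'one' in front of a numbered-pinyin syllable."""
    match = re.search(r"[1-4]", syllable)
    if match is None:
        raise ValueError(f"no tone number 1-4 in {syllable!r}")
    # Usual rule (an addition, not from the source): yi2 before tone 4,
    # yi4 before tones 1, 2 and 3.
    return "yi2" if match.group() == "4" else "yi4"


def cue_phrase(noun: str) -> str:
    """Build the 'one (mw) (noun)' phrase used to memorize a measure word."""
    mw = MEASURE_WORDS[noun]
    return f"{yi_before(mw)} {mw} {noun}"


if __name__ == "__main__":
    for noun in MEASURE_WORDS:
        print(cue_phrase(noun))
```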







How the database can help



In order to help myself with these three problems, I began a database of Chinese vocabulary.  I can switch fonts periodically, randomize words, print out study sheets, and add sentences to put vocabulary into context.  It is a working model.  I am constantly adding words and elaborating the thing.  There are typos left, I am sure.  There is one nice feature that may make up for the typos -- I have added measure words for many of the nouns.  I have not yet found a reference work which does this; I badger my Chinese friends and collect them as I read.


The database grew to include sound files, more words, sentences, and comments. You can use the full database by visiting Ting -- Chinese - English Study Center




The original database can be downloaded in several forms. They can be used as a limited dictionary by using the search function in Paradox or another program.  Each contains about 1,200 entries.  Two versions are available:  one for Paradox with reports and a simple flash card, and one in TAB DELIMITED format which can be imported into many different programs. 



Both versions of the downloadable databases are based on the Simplified character version of the Practical Chinese Reader Series:  Books I and II from Beijing Language Institute. 






1) Chinese.zip -- just the vocabulary from Practical Chinese Reader I and II

1a) Chintxt.zip

2) Chinese2.zip -- the Practical Chinese Reader I and II with extra words and phrases.  In this form the component single syllable words are included for many of the compound words.  I found it easier to remember vocabulary in this way and to figure out the meaning of new words.

2a) Chintxt2.zip

Use them freely for personal or educational applications -- they may not be used in any other program or commercial product without permission.





Both versions are described below. 





The Paradox databases include fields for:

English keyword (single word reference for sorting)
English definition
Chinese Character
Pinyin without tone (this field is handy for searching by pinyin without entering the tone marks)
Chinese Pinyin with real tone marks
Chinese Pinyin with number tones
Measure word pinyin with number tones
Measure word Chinese character
Known (this field is designed to place an "x" on those words you know well so you can exclude them from further review.  When you use the form, click on this field and the character will turn to black on white.  If you hit F9 you can then put x in the field to indicate that you know it.  Then with a slight knowledge of Paradox, you can filter those words.)
Book
Chapter (The field allows you to sort the list by the main vocabulary and exclude the supplementary vocabulary.)
Chapter First Seen (In the Reader, words occur in text or supplemental lists before they are found in the regular vocabulary.  This field allows you to include them in your chapter lists.)
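Outside Paradox, the same two tricks (searching on the toneless pinyin field and filtering on the Known field) can be imitated in a few lines.  This is only a sketch in Python under assumptions:  the `Entry` class and its attribute names are invented here and cover just a few of the fields listed above, and the two sample entries are typed in by hand for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    """A cut-down, hypothetical stand-in for one row of the table."""
    english: str
    character: str
    pinyin_plain: str       # "Pinyin without tone"
    pinyin_numbered: str    # "Chinese Pinyin with number tones"
    known: str = ""         # "x" marks a word you know well


def still_to_review(entries: List[Entry]) -> List[Entry]:
    """Drop the entries marked with an x in the Known field."""
    return [e for e in entries if e.known.strip().lower() != "x"]


def find_by_pinyin(entries: List[Entry], query: str) -> List[Entry]:
    """Look words up by pinyin typed without tone marks or numbers."""
    wanted = query.replace(" ", "").lower()
    return [e for e in entries
            if e.pinyin_plain.replace(" ", "").lower() == wanted]


if __name__ == "__main__":
    # Hand-made sample rows, not copied from the database.
    words = [
        Entry("book", "书", "shu", "shu1", known="x"),
        Entry("table lamp", "台灯", "taideng", "tai2deng1"),
    ]
    print([e.english for e in still_to_review(words)])
    print([e.character for e in find_by_pinyin(words, "shu")])
```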

To use them you should have:





  1. Paradox 5 or 10 (both support Chinese, the versions between don't).  Paradox allows you to use multiple fonts within the same table, making
    it ideal for a database which includes Pinyin, Characters, and English.

  2. TwinBridge 3.1 or 4.0 or some other Chinese system.  You will find
    sources at the bottom of this page.  I have set the font on
    the Chinese characters to JSong (GB code) which comes with the standard TwinBridge package -- if you have another Chinese system you will have to change the font in the Chinese Character field and in the Measure Word field.  If you have the Language Pack installed in Windows, you can view the characters by right-clicking on the field with the weird characters and changing the font to MS Song or SimSun.

  3. Arial MT -- This is optional, but lets you see and print the tone
    marks (there is also a numbered pinyin field if you don't have it) -- Arial MT is a Chinese Pinyin TTFont designed to be used with any Windows program.  It is available through Cheng and Tsui, a mail-order house in Massachusetts for Asian language materials.  Ask for PinTone -- it costs about $15 --
    well worth it whether you are a student or a teacher.  There is no standard pinyin font at the moment.  While UNICODE includes pinyin, it is not easily typed.  Several freeware pinyin fonts are available on the Internet.  Each is based on redefining the keyboard and each differs slightly.  Do a search in Alta Vista or Google for "pinyin font".
  4. Probably some patience.
  5. Chinese.zip (or chinese*.zip) (download it below)
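Since the tables carry both a numbered pinyin field and a tone-mark field, it may help to see how one can be derived from the other with Unicode instead of a special font.  The Python sketch below is not how the database was built; it assumes the usual placement rule (mark on a or e if present, on the o of ou, otherwise on the last vowel), assumes ü is typed as ü, and treats tone 5 as neutral.  The function names are made up.

```python
import re
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}


def mark_syllable(syllable: str, tone: str) -> str:
    """Put the tone mark on one pinyin syllable; tone 5 gets no mark."""
    if tone not in TONE_MARKS:
        return syllable
    lower = syllable.lower()
    if "a" in lower:
        pos = lower.index("a")
    elif "e" in lower:
        pos = lower.index("e")
    elif "ou" in lower:
        pos = lower.index("o")
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in "iouü"]
        if not vowels:
            return syllable          # nothing to carry the mark
        pos = vowels[-1]             # otherwise the last vowel is marked
    marked = syllable[:pos + 1] + TONE_MARKS[tone] + syllable[pos + 1:]
    # Compose letter + combining mark into a single character where possible.
    return unicodedata.normalize("NFC", marked)


def numbered_to_marked(text: str) -> str:
    """Convert numbered pinyin such as 'tai2deng1' to tone-marked pinyin."""
    return re.sub(r"([A-Za-zü]+)([1-5])",
                  lambda m: mark_syllable(m.group(1), m.group(2)),
                  text)


if __name__ == "__main__":
    print(numbered_to_marked("yi4 ben3 shu1"))
    print(numbered_to_marked("yi4 zhan3 tai2deng1"))
```

Going the other way is harder, which is one reason a separate numbered field is handy for typing and searching.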

In Chinese.zip you will find:

reader.db -- The database (unzips to 2.5 megs or 3.1 megs for chinese2.zip)

reader.rsl -- Prints out vocabulary -- four words per page.  The page is designed to be a tri-fold so that you can view English, Character, or Pinyin and test your knowledge of the other two.  There are four words per page so they can be shuffled to minimize context -- or cut into floppy index cards.  You can easily change this to get up to 20 words on a page.

reader.fsl -- The form is a handy way to review vocabulary on the computer.

reader.tv -- A Paradox file.

reader.fam -- A Paradox file.

readme.htm and readme.txt -- This file.
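The idea behind the report (shuffle the words to minimize context, then lay them out a few to a page) is easy to imitate with the text version.  A minimal Python sketch, assuming the words are already in a list; `shuffled_pages` is a made-up name and has nothing to do with the Paradox report itself:

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def shuffled_pages(words: Sequence[T], per_page: int = 4,
                   seed: Optional[int] = None) -> List[List[T]]:
    """Shuffle the words and split them into pages of `per_page` items."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    deck = list(words)                 # leave the caller's list untouched
    random.Random(seed).shuffle(deck)  # a seed makes a sheet reproducible
    return [deck[i:i + per_page] for i in range(0, len(deck), per_page)]


if __name__ == "__main__":
    sample = ["shu1", "tai2deng1", "bei1", "zhang1", "ben3", "zhan3"]
    for number, page in enumerate(shuffled_pages(sample), start=1):
        print(number, page)
```

Passing per_page=20 corresponds to the denser layout mentioned above.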


Copy these files into your working directory for Paradox, start TwinBridge, and open the database in Paradox.



The text versions (tab delimited) include fields for:




English Definition
Chinese Character
Chinese Pinyin With Number Tones
Part of Speech
Measure Word Chinese Character
Book
Chapter
Chinese Pinyin With Real Tone Marks
Measure Word Pinyin With Number Tones

To use them you should have:



  1. TwinBridge 3.1 or 4.0 or some other Chinese system.  You will find
    sources at the bottom of this page.  Or the Language Pack from Windows can be used for viewing.

  2. Arial MT -- This is optional, but lets you see and print the tone
    marks (there is also a numbered pinyin field if you don't have it)
  3. Almost any spreadsheet or database program can import a tab delimited database table.
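If you would rather script the import than use a spreadsheet, a tab delimited table can be read with Python's standard csv module.  The sketch below rests on assumptions:  the column names are invented, the column order is taken from the field list above and may not match the file, the file is assumed to have no header row, and the file name and encoding in the comment are guesses.

```python
import csv
from typing import Dict, Iterable, List

# Assumed column order, following the field list for the text versions.
FIELDS = [
    "english", "character", "pinyin_numbered", "part_of_speech",
    "mw_character", "book", "chapter", "pinyin_marked", "mw_pinyin_numbered",
]


def read_entries(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Turn tab delimited lines into one dictionary per vocabulary entry."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    entries = []
    for row in reader:
        if not row:
            continue                                   # skip blank lines
        # Pad short rows and trim long ones so every entry has all fields.
        row = row[:len(FIELDS)] + [""] * (len(FIELDS) - len(row))
        entries.append(dict(zip(FIELDS, row)))
    return entries


# Possible use (file name and encoding are assumptions):
#     with open("chintxt.txt", encoding="gb2312", newline="") as handle:
#         entries = read_entries(handle)
```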

Acknowledgements



Liu Yu Rong, Shou Danni, and Feng Xie of Beijing Polytechnic University and Wang Hong Fang of Farmington have all helped to either proof-read or add measure words to the database.



Paradox is a trademark of Borland International. www.borland.com



TwinBridge 3.5 and Chinese Partner 4.0 are products of the TwinBridge Software Corporation www.twinbridge.com






Download

Chinese.zip (standard vocabulary) -- 83,200+ k

Chinese2.zip (expanded vocabulary) -- 101,900+ k

Chintxt.zip (text version standard vocabulary) -- 101,900+ k

Chintxt2.zip (text version expanded vocabulary) -- 101,900+ k








Reading Chinese in the database and on the Internet



To read the characters in the GB code fields in the database, you must have a Chinese reader for Windows.  Several are available.  To get a demo of TwinBridge Chinese Partner go to TwinBridge.  I have a more extensive discussion of Chinese systems for multiple platforms in the FAQ.

If you are using TwinBridge, make sure that the option "Map all characters to English" is chosen in the configuration menu.  This will make it possible to see GB coded characters in Netscape and other Windows applications without special Chinese fonts.

Once you have a system or helper working, you can see the characters more clearly in the Chinese fields in the database if you enlarge the font.  In Paradox, right click on the field and a choice box will come up.  Choose Font and then Size -- 18 gives a very readable font for the screen.
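GB coded text can also be converted to Unicode, after which most current software displays it without a separate Chinese system.  A small Python sketch, assuming the bytes really are GB2312 (which is how Python names the codec); the function names are invented for illustration:

```python
def gb_to_text(raw: bytes) -> str:
    """Decode GB coded bytes into an ordinary Unicode string."""
    return raw.decode("gb2312")


def gb_to_utf8(raw: bytes) -> bytes:
    """Re-encode GB coded bytes as UTF-8."""
    return gb_to_text(raw).encode("utf-8")


if __name__ == "__main__":
    sample = "一本书".encode("gb2312")   # stand-in for bytes read from a file
    print(gb_to_text(sample))
    print(len(sample), "GB bytes ->", len(gb_to_utf8(sample)), "UTF-8 bytes")
```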



To learn more about reading Chinese in Windows, on the Internet and in Netscape you should visit

these pages:
















http://hua.umf.maine.edu/China/database.html
Last update: January 2002
© Marilyn Shea 1996, 1999, 2002