
Chinese Learner English Corpus (CLEC) (Gui Shichun, Yang Huizhong)

Practical Standard Document

Chinese Learner English Corpus

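CLEC contains over one million words of text from five types of learners: secondary-school students, College English Band 4 and Band 6 students, and lower-year and upper-year English majors. Errors in the texts are tagged. The aim is to observe the English features and the errors of each learner group, to describe Chinese learner English reasonably precisely by quantitative and qualitative methods, and to provide useful feedback for English teaching in China.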

Table 1. Distribution of the CLEC texts

Type     Tokens
ST2      208088
ST3      209043
ST4      212855
ST5      214510
ST6      226106
Total   1070602

Principles of error tagging

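1. Simple, reasonable, and easy to apply systematically. Many people take part in the tagging, and an overly elaborate classification scheme would be hard to master. We use a two-level classification. The first level has 11 categories: word form (fm), verb phrase (vp), noun phrase (np), pronoun (pr), adjective phrase (aj), adverb (ad), prepositional phrase (pp), conjunction (cj), lexis (wd), collocation (cc), and sentence (sn). Each category is subdivided by number. For example, [cc] marks an inappropriate collocation: [cc1] is a noun/noun collocation, [cc2] a noun/verb collocation, [cc3] a verb/noun collocation, and so on.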

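2. The granularity of the scheme should be moderate. A coarse scheme is easy to apply consistently but carries too little information to support the analysis of learner errors; an overly fine scheme is hard to apply consistently, and the same error easily ends up in different categories. Our current approach is fine-grained for common errors (vp and np each have 9 subtypes) and coarse for rare ones (cj has only two subtypes). The scheme has 61 error codes, which makes it medium-sized.

3. Provide enough information about each error (the error itself, its type, and its scope). For example, in "In the past, people are [vp6, 4-] kind to each other…" the error is marked with square brackets placed after the error. [vp6] is the sixth type (tense) of vp (verb phrase) error. 4- is the scope of the error: - marks the position of the error, and 4 means that there are 4 words before it. Only by relating "are" to these 4 words can one tell that it is wrong.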

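4. Openness. Researchers may add error types or subdivide existing ones as needed. For example, [sn8] marks a structurally deficient sentence, and a researcher can divide this error into several subtypes for study. This requires retrieving all sn8 errors and then defining third-level categories such as sn81, sn82, and so on.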


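5. Register and the cause of an error are not tagged for now, because this requires more subjective judgment from the annotators and is even harder to apply consistently.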
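The tag format described in principle 3 can be read mechanically. The following is a minimal Python sketch, not the CLEC project's own tooling. It assumes that a tag always follows the erroneous word, that the scope part may be absent, and that the number before "-" counts the context words to the left. The names `TAG_RE`, `TaggedError`, and `find_errors` are hypothetical.

```python
import re
from typing import List, NamedTuple

# Matches tags such as "[vp6, 4-]". Treating the scope part as optional is an
# assumption; the text only shows a tag that carries a scope.
TAG_RE = re.compile(r"\[([a-z]{2})(\d+)(?:\s*,\s*(\d*)-(\d*))?\]")


class TaggedError(NamedTuple):
    code: str           # full error code, e.g. "vp6"
    category: str       # first-level category, e.g. "vp"
    word: str           # the word immediately before the tag
    context: List[str]  # the words before the error that the scope points to


def find_errors(text: str) -> List[TaggedError]:
    """Return every tagged error in text together with its left context."""
    errors = []
    for match in TAG_RE.finditer(text):
        # Digits after "-" are captured but not interpreted, because the
        # text does not describe them.
        category, number, before, _after = match.groups()
        # Drop earlier tags so that they are not counted as words.
        preceding = TAG_RE.sub(" ", text[:match.start()]).split()
        word = preceding[-1] if preceding else ""
        n_before = int(before) if before else 0
        context = preceding[-1 - n_before:-1] if n_before else []
        errors.append(TaggedError(category + number, category, word, context))
    return errors


sample = "In the past, people are [vp6, 4-] kind to each other"
for err in find_errors(sample):
    print(err.code, err.word, err.context)
```

Filtering the result on `code == "sn8"` would give the retrieval step that principle 4 mentions as the starting point for defining third-level categories.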

Error classification table (total: 61 codes)

Category               Codes and types
Word form (fm)         fm1 spelling; fm2 word building; fm3 capitalization
Verb phrase (vp)       vp1 pattern; vp2 set phrase; vp3 agreement; vp4 finite/non-finite; vp5 non-finite; vp6 tense; vp7 voice; vp8 mood; vp9 modal/auxiliary
Noun phrase (np)       np1 pattern; np2 set phrase; np3 agreement; np4 case; np5 countability; np6 number; np7 article; np8 quantifiers; np9 other determiners
Pronoun (pr)           pr1 reference; pr2 anticipatory it; pr3 agreement; pr4 case; pr5 wh-; pr6 indefinite
Adjective phrase (aj)  aj1 pattern; aj2 set phrase; aj3 degree; aj4 -ed/-ing confusion; aj5 predicative/attributive
Adverb (ad)            ad1 order; ad2 modification; ad3 degree
Prepositional phrase (pp)  pp1 pattern; pp2 set phrase
Conjunction (cj)       cj1 pattern; cj2 set phrase
Lexis (wd)             wd1 order; wd2 part of speech; wd3 substitution; wd4 absence; wd5 redundancy; wd6 repetition; wd7 ambiguity
Collocation (cc)       cc1 noun/noun; cc2 noun/verb; cc3 verb/noun; cc4 adj/noun; cc5 verb/adv; cc6 adv/adj
Sentence (sn)          sn1 run-on sentence; sn2 sentence fragment; sn3 dangling modifier; sn4 illogical comparison; sn5 topic prominence; sn6 coordination; sn7 subordination; sn8 structural deficiency; sn9 punctuation
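The two-level scheme lends itself to a small lookup structure. The sketch below is illustrative only: it records the number of subtypes per category as listed in the classification table, generates the codes, and shows one possible way to derive third-level codes in the spirit of the openness principle. `SUBTYPE_COUNTS`, `all_codes`, and `third_level` are hypothetical names, not part of CLEC.

```python
from typing import Dict, List

# Number of second-level subtypes per first-level category,
# as listed in the classification table.
SUBTYPE_COUNTS: Dict[str, int] = {
    "fm": 3, "vp": 9, "np": 9, "pr": 6, "aj": 5, "ad": 3,
    "pp": 2, "cj": 2, "wd": 7, "cc": 6, "sn": 9,
}


def all_codes() -> List[str]:
    """Generate every second-level code, e.g. fm1 ... sn9."""
    return [f"{cat}{i}"
            for cat, n in SUBTYPE_COUNTS.items()
            for i in range(1, n + 1)]


def third_level(code: str, count: int) -> List[str]:
    """Hypothetical helper: derive third-level codes such as sn81, sn82."""
    if code not in all_codes():
        raise ValueError(f"unknown error code: {code}")
    return [f"{code}{i}" for i in range(1, count + 1)]


print(len(SUBTYPE_COUNTS), len(all_codes()))
print(third_level("sn8", 2))
```

Generating the codes from the per-category counts makes it easy to confirm that 11 categories yield the 61 codes stated in the table title.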

Tagging notes

Code  Class     Type: description
fm1   word      Spelling: spelling, coinage, abbreviation, apostrophe
fm2   word      Word building: derivation, inflection, compounding, plurality (noun), irregularity (verb), 3rd person singular form (verb), syllabification, hyphenation, word division or fusion
fm3   word      Capitalization: lower initial letter for upper initial letter or vice versa
vp1   vb phr    Pattern (transitivity): error in transitivity (vi as vt or vice versa), transitive verb pattern/grammatical (cf. Oxford Advanced Learner's Dictionary of Current English, edited by A. S. Hornby)
vp2   vb phr    Set phrase: phrasal verb and verbal phrase: error in form or use
vp3   vb phr    Agreement: number agreement with its subject (noun or pronoun)
vp4   vb phr    Finite/non-finite: finite verb for non-finite verb or vice versa
vp5   vb phr    Non-finite: infinitive error: form and use / infinitive for participle or vice versa / -ed participle for -ing participle or vice versa
vp6   vb phr    Tense: error in tense use within a sentence / the sequence of tenses between sentences
vp7   vb phr    Voice: error in the use of voice: active for passive or vice versa
vp8   vb phr    Mood: error in the use of mood: imperative, subjunctive / improper structure of conditional sentences
vp9   vb phr    Modal/auxiliary: misuse of modal/auxiliary verbs / wrong form of modal verb (or auxiliary verb) and verb combination (e.g. tense form, voice form, etc.)
np1   nn phr    Pattern: error in combination with other words/grammatical
np2   nn phr    Set phrase: omission or replacement of a fixed element that goes after a certain noun
np3   nn phr    Agreement: number agreement of a noun with its determiner or a word that refers to it
np4   nn phr    Case: possessive case error: form or use
np5   nn phr    Countability: uncountable noun used as countable noun
np6   nn phr    Number: countable noun used with no determiner or -s / a or -s with plural noun
np7   nn phr    Article: a/an confusion or definite/indefinite confusion
np8   nn phr    Quantifiers: misuse or confusion between many/much, (a) few/(a) little, some/any, etc.
np9   nn phr    Other determiners: misuse or confusion of demonstratives, wh- determiners, numerals, etc.
pr1   pron      Reference: incorrect/ambiguous pronoun reference/anaphoric
pr2   pron      Anticipatory it: improper or wrong use of anticipatory it / it replaced by a demonstrative, etc.
pr3   pron      Agreement: number agreement with a noun it refers to
pr4   pron      Case: case error of any personal pronoun
pr5   pron      Wh-: misuse or confusion of interrogative, relative and conjunctive pronouns
pr6   pron      Indefinite: misuse or confusion of indefinite pronouns such as all/both, few/little, some/any, either/neither, etc.
aj1   adj       Pattern: error in the combination with other words/grammatical
aj2   adj       Set phrase: error in the idiomatic use of an adjectival phrase / omission or replacement of a fixed element that goes after a certain adjective
aj3   adj       Degree: adjective degree error: form and use
aj4   adj       -ed/-ing confusion: -ed adjective for -ing adjective or vice versa
aj5   adj       Predicative/attributive: predicative adjective used as attributive adjective
ad1   adv       Order: improper adverb placement/wrong position
ad2   adv       Modification: adjective modifier used as verb modifier / other kinds of confusion
ad3   adv       Degree: adverb degree error: form and use
pp1   prep      Pattern: unacceptable combination with other words/grammatical
pp2   prep      Set phrase: error in the formation or use of an idiomatic prepositional phrase
cj1   conj      Pattern: unacceptable combination with other words/grammatical
cj2   conj      Set phrase: error in the formation or use of a phrase functioning as a conjunction
wd1   word      Order: misplacement of any word other than an adverb
wd2   word      Part of speech: error in part of speech: right root but wrong word class
wd3   word      Substitution: error in word choice: right word class but wrong selection (any part of speech)
wd4   word      Absence: omission of a word (any part of speech)
wd5   word      Redundancy: oversuppliance of a word (any part of speech)
wd6   word      Repetition: unnecessary repeating of a word
wd7   word      Ambiguity: not clear word meaning/semantic
cc1   notional  n/n collocation: improper noun (phrase) and noun (phrase) combination/semantic
cc2   notional  n/v collocation: improper noun (phrase) and verb (phrase) combination/semantic
cc3   notional  v/n collocation: improper verb and noun (phrase) combination/semantic
cc4   notional  a/n collocation: improper adjective and noun (phrase) combination/semantic
cc5   notional  v/ad collocation: improper verb and adverb (or ad/v) combination/semantic
cc6   notional  ad/a collocation: improper adverb and adjective combination/semantic
sn1   sentence  Run-on sentence: improper addition of clauses/fused sentence
sn2   sentence  Sentence fragment: subordinate clause as a sentence / any phrase as a sentence
sn3   sentence  Dangling modifier: illogical adverbial modification of a clause
sn4   sentence  Illogical comparison: error in the comparison of words or phrases in a sentence which cannot be compared
sn5   sentence  Topic prominence: the co-occurrence of an initial noun phrase and its equivalent (usually a pronoun) in the same sentence
sn6   sentence  Coordination: faulty parallelism of clauses (or words/phrases) in a sentence
sn7   sentence  Subordination: faulty attachment of a subordinate clause to the main clause
sn8   sentence  Structural deficiency: error in the grammatical construction of a sentence: improper splitting, pattern shifting, confusing structure, etc.
sn9   sentence  Punctuation: overuse, absence, choice, apostrophe, comma splice, etc.

Normalized frequencies and proportions of each error type

Type         st2      st3      st4      st5      st6     Total  Pct (%)
fm1       1928.8   2877.4   2112.6   1826.7   1686.7   10432.2    17.47
fm2        349.3    448.9    438.9    226.9    328.7    1792.7        3
fm3       1474.4    731.8    405.8    694.1    174.6    3480.7     5.83
vp1        259.4    325.9    498.4    103.4    200.8    1387.9     2.32
vp2          179    139.3     61.2    104.2     22.1     505.8     0.85
vp3          374    524.6    785.2    273.1      327    2283.9     3.82
vp4        140.8    159.1    110.8     63.9     51.6     526.2     0.88
vp5          140    118.7    107.4     89.9     46.7     502.7     0.84
vp6       1165.7      356    311.6    379.8    215.6    2428.7     4.07
vp7        172.7    104.1     98.4     63.9     46.7     485.8     0.81
vp8         27.1     16.3      8.3     25.2     11.5      88.4     0.15
vp9        111.4    274.3    278.5     42.9     86.1     793.2     1.33
np1         46.9     33.5     28.9     16.8     10.7     136.8     0.23
np2         24.7     22.4     17.4     19.3      2.5      86.3     0.14
np3        202.1    247.7    249.6    210.9      186    1096.3     1.84
np4         66.8     55.9     26.4     22.7     21.3     193.1     0.32
np5         58.9       98     71.9     60.5     84.4     373.7     0.63
np6          374    654.4      481    358.8    354.1    2222.3     3.72
np7        237.9    107.5     89.3    174.8     54.9     664.4     1.11
np8           35     65.4     47.9     13.4      7.4     169.1     0.28
np9          6.4     41.3     12.4      7.6      5.7      73.4     0.12
pr1           82    236.5      205     89.9     18.9     632.3     1.06
pr2         16.7     78.3     23.1      4.2        0     122.3      0.2
pr3         52.5     54.2    172.7     28.6     60.6     368.6     0.62
pr4         74.8       37     20.7     48.7     10.7     191.9     0.32
pr5         26.3     53.3     14.1      7.6     10.7       112     0.19
pr6          9.5      2.6        5      3.4        0      20.5     0.03
aj1          6.4     18.9     15.7        5        9        55     0.09
aj2          9.5      3.4      9.9      5.9      7.4      36.1     0.06
aj3         38.2     39.6     32.2     43.7     97.5     251.2     0.42
aj4         16.7      2.6     22.3     12.6      5.7      59.9      0.1
aj5          0.8      3.4      7.4      1.7        0      13.3     0.02
ad1         35.8     96.3     39.7     27.7     15.6     215.1     0.36
ad2         42.2     37.8     12.4      9.2      4.9     106.5     0.18
ad3          7.2       12      9.9      1.7      2.5      33.3     0.06
pp1        136.1       98       43    169.7     28.7     475.5      0.8
pp2         25.5    262.3    143.8       37     27.9     496.5     0.83
cj1         27.8     20.6     18.2     21.8     12.3     100.7     0.17
cj2            4      7.7     13.2      5.9      4.9      35.7     0.06
wd1         43.8    151.3    114.1     25.2     37.7     372.1     0.62
wd2        324.6    929.6    772.8    226.9    242.6    2496.5     4.18
wd3         1102   1634.7     1815    757.1    359.8    5668.6     9.49
wd4        585.6    829.8    443.8    403.3      427    2689.5      4.5
wd5        410.6    613.1    518.2    265.5    171.3    1978.7     3.31
wd6         27.1       37     22.3     34.5     29.5     150.4     0.25
wd7        261.8    430.8    261.2    228.6    209.8    1392.2     2.33
cc1         72.4     65.4       76     23.5     36.1     273.4     0.46
cc2           35    177.1     49.6      6.7     21.3     289.7     0.49
cc3        168.7    514.2    417.4     75.6    112.3    1288.2     2.16
cc4         64.5     94.6    134.7       42     39.3     375.1     0.63
cc5         23.9     40.4     29.8        5      4.1     103.2     0.17
cc6         17.5       12      6.6      2.5      1.6      40.2     0.07
sn1        419.3    596.8    576.9    118.5     42.6    1754.1     2.94
sn2        424.9    389.6    303.3    132.8     76.2    1326.8     2.22
sn3         10.3     20.6     17.4      2.5     10.7      61.5      0.1
sn4         17.5     24.9      6.6     20.2      4.9      74.1     0.12
sn5          9.5     14.6     17.4      2.5      4.9      48.9     0.08
sn6         84.3     41.3     39.7     41.2      1.6     208.1     0.35
sn7         49.3     55.9     63.6     23.5      3.3     195.6     0.33
sn8       1103.6    446.3    862.1    493.2    231.9    3137.1     5.25
sn9        861.7    573.6    337.2    649.5    322.9    2744.9      4.6
Total    14105.2  16160.6  13935.9   8883.4   6633.8   59718.9      100

Errors ranked by major class

Class                st2      st3      st4      st5      st6     Total  Percent  Cumulative
Word form         3752.5   4058.1   2957.3   2747.7     2190   15705.6   26.299      26.299
Lexis             2755.5   4626.3   3947.4   1941.1   1477.7     14748   24.696      50.995
Syntax            2980.4   2163.6   2224.2   1483.9      699    9551.1   15.993      66.988
Verb              2570.1   2018.3   2259.8   1146.3   1008.1    9002.6   15.075      82.063
Noun              1052.7   1326.1   1024.8    884.8      727    5015.4    8.398      90.461
Collocation          382    903.7    714.1    155.3    214.7    2369.8    3.968      94.429
Pronoun            261.8    461.9    440.6    182.4    100.9    1447.6    2.424      96.853
Preposition        161.6    360.3    186.8    206.7     56.6       972    1.628      98.481
Adjective           71.6     67.9     87.5     68.9    119.6     415.5    0.696      99.177
Adverb              85.2    146.1       62     38.6       23     354.9    0.594      99.771
Conjunction         31.8     28.3     31.4     27.7     17.2     136.4    0.228      99.999
Total            14105.2  16160.6  13935.9   8883.4   6633.8   59718.9   99.999
Share of total      0.24     0.27     0.23     0.15     0.11
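The percentage and cumulative-percentage columns of the major-class breakdown follow from the per-level frequencies. The sketch below is one way to recompute them; it is not the authors' procedure. Row totals are recomputed from the per-level figures rather than copied, and the cumulative column is assumed to be a running sum of the rounded percentages, which is why it ends at 99.999 rather than 100. `ROWS`, `summarize`, and `level_shares` are hypothetical names.

```python
from typing import Dict, List, Tuple

LEVELS = ["st2", "st3", "st4", "st5", "st6"]

# Normalized frequencies per learner level for each major class.
ROWS: Dict[str, List[float]] = {
    "word form":   [3752.5, 4058.1, 2957.3, 2747.7, 2190.0],
    "lexis":       [2755.5, 4626.3, 3947.4, 1941.1, 1477.7],
    "syntax":      [2980.4, 2163.6, 2224.2, 1483.9, 699.0],
    "verb":        [2570.1, 2018.3, 2259.8, 1146.3, 1008.1],
    "noun":        [1052.7, 1326.1, 1024.8, 884.8, 727.0],
    "collocation": [382.0, 903.7, 714.1, 155.3, 214.7],
    "pronoun":     [261.8, 461.9, 440.6, 182.4, 100.9],
    "preposition": [161.6, 360.3, 186.8, 206.7, 56.6],
    "adjective":   [71.6, 67.9, 87.5, 68.9, 119.6],
    "adverb":      [85.2, 146.1, 62.0, 38.6, 23.0],
    "conjunction": [31.8, 28.3, 31.4, 27.7, 17.2],
}


def summarize(rows: Dict[str, List[float]]) -> List[Tuple[str, float, float, float]]:
    """Return (class, total, percent, cumulative percent), largest class first."""
    totals = {name: sum(values) for name, values in rows.items()}
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    summary = []
    cumulative = 0.0
    for name, total in ranked:
        percent = round(100 * total / grand_total, 3)
        cumulative = round(cumulative + percent, 3)
        summary.append((name, round(total, 1), percent, cumulative))
    return summary


def level_shares(rows: Dict[str, List[float]]) -> Dict[str, float]:
    """Share of all errors that falls in each learner level."""
    columns = [sum(values[i] for values in rows.values())
               for i in range(len(LEVELS))]
    grand_total = sum(columns)
    return {level: round(col / grand_total, 2)
            for level, col in zip(LEVELS, columns)}


for name, total, percent, cumulative in summarize(ROWS):
    print(f"{name:<12}{total:>10}{percent:>9}{cumulative:>9}")
print(level_shares(ROWS))
```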

Most common errors of Chinese learners

Type         st2      st3      st4      st5      st6     Total  Pct (%)
fm1       1928.8   2877.4   2112.6   1826.7   1686.7   10432.2    17.47
wd3         1102   1634.7     1815    757.1    359.8    5668.6     9.49
fm3       1474.4    731.8    405.8    694.1    174.6    3480.7     5.83
sn8       1103.6    446.3    862.1    493.2    231.9    3137.1     5.25
sn9        861.7    573.6    337.2    649.5    322.9    2744.9      4.6
wd4        585.6    829.8    443.8    403.3      427    2689.5      4.5
wd2        324.6    929.6    772.8    226.9    242.6    2496.5     4.18
vp6       1165.7      356    311.6    379.8    215.6    2428.7     4.07
vp3          374    524.6    785.2    273.1      327    2283.9     3.82
np6          374    654.4      481    358.8    354.1    2222.3     3.72
wd5        410.6    613.1    518.2    265.5    171.3    1978.7     3.31
fm2        349.3    448.9    438.9    226.9    328.7    1792.7        3
sn1        419.3    596.8    576.9    118.5     42.6    1754.1     2.94
wd7        261.8    430.8    261.2    228.6    209.8    1392.2     2.33
vp1        259.4    325.9    498.4    103.4    200.8    1387.9     2.32
sn2        424.9    389.6    303.3    132.8     76.2    1326.8     2.22
cc3        168.7    514.2    417.4     75.6    112.3    1288.2     2.16
np3        202.1    247.7    249.6    210.9      186    1096.3     1.84
vp9        111.4    274.3    278.5     42.9     86.1     793.2     1.33
np7        237.9    107.5     89.3    174.8     54.9     664.4     1.11
pr1           82    236.5      205     89.9     18.9     632.3     1.06

As the table above shows:

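1. All three word-form error types (spelling, word building, capitalization) are on the list, with spelling at the very top at 17.47% of all errors. Together the three account for 26.30%.

2. Five of the seven lexical error types are on the list (substitution, absence, part of speech, redundancy, ambiguity), accounting for 23.81% of all errors.

3. Four of the nine sentence-level error types are on the list (structural deficiency, punctuation, run-on sentence, sentence fragment), accounting for 15.01% of all errors.

4. Four of the nine verb-phrase error types are on the list (tense, agreement, transitivity pattern, modal/auxiliary), accounting for 11.54% of all errors.

5. Three of the nine noun-phrase error types are on the list (number, agreement, article), accounting for 6.67%.

6. Other errors (verb/noun collocation, pronoun reference) account for 3.22%.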
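The group shares quoted for the lexical, sentence-level, verb-phrase, noun-phrase, and other errors are sums of the per-code percentages in the frequency table. A short sketch of that arithmetic follows; the mapping of each group to its codes is inferred from the type names given in the list, and `PERCENT`, `GROUPS`, and `group_share` are hypothetical names.

```python
from typing import Dict, List

# Percentage of all errors per code, taken from the frequency table.
PERCENT: Dict[str, float] = {
    "wd3": 9.49, "wd4": 4.5, "wd2": 4.18, "wd5": 3.31, "wd7": 2.33,
    "sn8": 5.25, "sn9": 4.6, "sn1": 2.94, "sn2": 2.22,
    "vp6": 4.07, "vp3": 3.82, "vp1": 2.32, "vp9": 1.33,
    "np6": 3.72, "np3": 1.84, "np7": 1.11,
    "cc3": 2.16, "pr1": 1.06,
}

# Codes behind each group in the list (inferred from the type names).
GROUPS: Dict[str, List[str]] = {
    "lexis":       ["wd3", "wd4", "wd2", "wd5", "wd7"],
    "sentence":    ["sn8", "sn9", "sn1", "sn2"],
    "verb phrase": ["vp6", "vp3", "vp1", "vp9"],
    "noun phrase": ["np6", "np3", "np7"],
    "other":       ["cc3", "pr1"],
}


def group_share(codes: List[str]) -> float:
    """Sum the percentages of the given codes, rounded to two decimals."""
    return round(sum(PERCENT[code] for code in codes), 2)


for group, codes in GROUPS.items():
    print(group, group_share(codes))
```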

Most common spelling errors of Chinese learners

Freq Word            Freq Word            Freq Word            Freq Word
 379 MORTALITY         23 THEMSELVES        15 LIMITED           12 WRITING
 113 KNOWLEDGE         21 FESTIVAL          15 NOTICE            11 ARTICLE
  78 POLLUTION         20 BELIEVE           15 OURSELVES         11 CONTRARY
  76 ENVIRONMENT       20 COUNTRY           15 PERSONNEL         11 EXERCISE
  69 NOWADAYS          19 ESPECIALLY        15 STUDENTS          11 FAVORITE
  68 GOVERNMENT        19 FAMILIAR          14 CALENDAR          11 INSTEAD
  56 MODERN            19 REMEMBER          14 CAUGHT            11 MASTER
  44 PRACTICE          18 COURSE            14 CENTURY           11 PARENT
  44 SOMETHING         18 EXERCISES         14 COMPETITION       11 PRACTISE
  41 POLLUTED          18 HASTILY           14 FIRST             11 RESOURCE
  37 BEAUTIFUL         18 INDUSTRY          14 FURTHERMORE       11 TRAVEL
  36 COUNTRIES         18 OFTEN             14 MAGAZINES         10 CONDITION
  36 STUDYING          18 SEVERAL           14 MEDICINE          10 DECREASED
  35 CHALLENGE         18 TRADITIONAL       14 UNIVERSITY        10 ENERGY
  34 TECHNOLOGY        17 CREATE            13 FINANCIAL         10 HAPPINESS
  32 BENEFIT           17 GRAMMAR           13 GREAT             10 INDIVIDUALS
  32 EUTHANASIA        17 NECESSARY         13 MOREOVER          10 PURSUE
  30 BECAUSE           17 PEOPLE            13 OPPORTUNITY       10 RAISE
  28 LANTERNS          17 SATURDAY          13 PRACTICAL         10 SHOULD
  28 REALIZE           17 THEORETICAL       13 RECEIVED          10 SUCCESS
  27 COLLEGE           17 THOUGHT           13 YOURSELF          10 THEREFORE
  26 INTERESTING       16 CONTROL           12 EXPECTANCY        10 TRAVELING
  25 COMMODITIES       16 CONVENIENT        12 FACTORIES         10 WASTE
  25 LANTERN           16 POPULATION        12 OPPORTUNITIES     10 WHETHER
  25 SUDDENLY          16 WILLIAM           12 PRACTICES
  24 IMPORTANT         15 BEGINNING         12 TRANSPORTATION

Lexical errors of Chinese learners

Type         st2      st3      st4      st5      st6     Total  Pct (%)
wd1         43.8    151.3    114.1     25.2     37.7     372.1     0.62
wd2        324.6    929.6    772.8    226.9    242.6    2496.5     4.18
wd3         1102   1634.7     1815    757.1    359.8    5668.6     9.49
wd4        585.6    829.8    443.8    403.3      427    2689.5      4.5
wd5        410.6    613.1    518.2    265.5    171.3    1978.7     3.31
wd6         27.1       37     22.3     34.5     29.5     150.4     0.25
wd7        261.8    430.8    261.2    228.6    209.8    1392.2     2.33
