Pronunciation Assessment (streaming version) API documentation
Interface description
This interface uses intelligent speech technology to automatically evaluate pronunciation level, detect pronunciation errors, locate defects, and analyze problems. The core technology has two parts: automatic evaluation of Mandarin Chinese pronunciation and automatic evaluation of English pronunciation.
- Get the authentication credentials:
apply for an appid on the IFLYTEK open platform and add the service (streaming interface) to obtain the interface keys APIKey and APISecret.
- Integrate the WebSocket interface:
the interface and its parameter description are generic, but the test question format differs between Chinese and English;
see the test question format description.
Interface Demo
Example demo: please click here to download.
Currently, demos are provided for only some development languages; for other languages, please refer to the interface documentation below.
Interface requirements
Item | Description |
---|---|
Request protocol | ws |
Request address | ws://ise-api-ko.xf-yun.com/v2/ise |
Interface authentication | Signature mechanism; see Interface authentication below for details |
Development languages | Any language that can initiate WebSocket requests to the cloud service |
Audio properties | Sampling rate 16 kHz, bit depth 16 bit, mono |
Audio format | pcm, wav, mp3 (requires setting aue to lame), speex-wb |
Audio length | A session of sending audio data cannot exceed 5 minutes |
Language | Chinese, English |
Interface call flow
- Parameter upload phase (see the description of business parameters for details):
the first upload carries the parameters, with data.status=0 and cmd="ssb";
- Audio upload phase, during which audio data is sent:
the first audio frame is sent with cmd="auw", aus=1, data.status=1;
intermediate audio frames are sent with cmd="auw", aus=2, data.status=1;
the last audio frame is sent with cmd="auw", aus=4, data.status=2.
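The call flow above can be sketched as a small lookup that returns the control fields for each stage. This is only an illustration: the class name IseStage and its methods are hypothetical, and the handling of an audio stream that fits in a single frame (treated here as the last frame) is an assumption that the flow description does not cover.

```java
/** Control fields for one upload stage (a sketch, not an official client class). */
final class IseStage {
    final String cmd;  // "ssb" for parameters, "auw" for audio
    final int aus;     // 1 = first, 2 = intermediate, 4 = last; 0 = unused here (assumption)
    final int status;  // value to place in data.status

    private IseStage(String cmd, int aus, int status) {
        this.cmd = cmd;
        this.aus = aus;
        this.status = status;
    }

    /** Parameter upload stage: cmd="ssb", data.status=0. */
    static IseStage parameters() {
        return new IseStage("ssb", 0, 0);
    }

    /** Audio upload stage for the frame at a zero-based index out of total frames. */
    static IseStage audio(int index, int total) {
        if (total <= 0 || index < 0 || index >= total) {
            throw new IllegalArgumentException(
                    "index " + index + " out of range for " + total + " frames");
        }
        if (index == total - 1) {
            return new IseStage("auw", 4, 2); // last frame (also a lone frame: assumption)
        }
        if (index == 0) {
            return new IseStage("auw", 1, 1); // first frame
        }
        return new IseStage("auw", 2, 1);     // intermediate frame
    }

    public static void main(String[] args) {
        // Print the control fields for a three-frame upload.
        for (int i = 0; i < 3; i++) {
            IseStage s = audio(i, 3);
            System.out.println(s.cmd + " aus=" + s.aus + " status=" + s.status);
        }
    }
}
```

Keeping the mapping in one place makes it harder for the aus and data.status values to drift apart when the sending loop changes.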
Interface authentication
In the handshake phase, the requester signs the request, and the server verifies the legitimacy of the request through the signature.
Authentication method
Authentication parameters are appended to the request address. Example URL:
ws://ise-api-ko.xf-yun.com/v2/ise?authorization=YXBpX2tleT0ia2V5eHh4eHh4eHh4eDhlZTI3OTM0ODUxOWV4eHh4eHh4eHh4IiwgYWxnb3JpdGhtPSJobWFjLXNoYTI1NiIsIGhlYWRlcnM9Imhvc3QgZGF0ZSByZXF1ZXN0LWxpbmUiLCBzaWduYXR1cmU9Im1WemtxWitBOTVFRlVoOGlCTENzUVI3WDhVKzNwUGc3dVF1amIwZlhYck09Ig==&date=Tue%2C+22+Dec+2020+06%3A29%3A31+GMT&host=ise-api-ko.xf-yun.com
Authentication Parameters:
Parameter | Type | Required | Description | Example |
---|---|---|---|---|
host | string | Yes | Request host | ise-api-ko.xf-yun.com |
date | string | Yes | Current timestamp in RFC1123 format | Wed, 10 Jul 2019 07:35:43 GMT |
authorization | string | Yes | Base64-encoded signature information (the signature is computed with hmac-sha256) | See the rules for generating the authorization parameter below |
Detailed rules for generating the authorization parameter
(1) Get the interface keys APIKey and APISecret.
They can be viewed on the console page after registration.
(2) Before base64 encoding, the authorization parameter (authorization_origin) has the following format:
api_key="$api_key",algorithm="hmac-sha256",headers="host date request-line",signature="$signature"
where api_key is the APIKey obtained from the console, algorithm is the signing algorithm (only hmac-sha256 is supported), and headers lists the parameters involved in signing (see the note below).
signature is the string obtained by signing the parameters involved in the signature and encoding the result with base64; see below.
Note: headers holds the names of the parameters involved in signing; it is the fixed string "host date request-line", not the values of these parameters.
(3) The original signature field (signature_origin) is built as follows.
It consists of the host, date, and request-line parameters joined in the following format (\n is a line break, and each ':' is followed by a space):
host: $host\ndate: $date\n$request-line
Suppose that:
request url = ws://ise-api-ko.xf-yun.com/v2/ise
date = Wed, 10 Jul 2019 07:35:43 GMT
Then the original signature field (signature_origin) is:
host: ise-api-ko.xf-yun.com
date: Wed, 10 Jul 2019 07:35:43 GMT
GET /v2/ise HTTP/1.1
(4) Sign signature_origin with the hmac-sha256 algorithm, using apiSecret as the key, to get the signed digest signature_sha.
signature_sha=hmac-sha256(signature_origin,$apiSecret)
where apiSecret is the APISecret obtained from the console.
(5) Encode signature_sha with base64 to get the final signature.
signature=base64(signature_sha)
Assume:
APISecret = secretxxxxxxxxxx2df7900c09xxxxxxxxxxx
date = Wed, 10 Jul 2019 07:35:43 GMT
Then the signature is:
signature=mVzkqZ+A95EFUh8iBLCsQR7X8U+3pPg7uQujb0fXXrM=
(6) Using the information above, assemble the string before base64 encoding (authorization_origin). For example:
api_key="keyxxxxxxxxxx8ee279348519exxxxxxxxxx", algorithm="hmac-sha256",
headers="host date request-line", signature="mVzkqZ+A95EFUh8iBLCsQR7X8U+3pPg7uQujb0fXXrM="
(7) Finally, base64-encode authorization_origin to get the final authorization parameter.
authorization = base64(authorization_origin)
Example:
authorization=
YXBpX2tleT0ia2V5eHh4eHh4eHh4eDhlZTI3OTM0ODUxOWV4eHh4eHh4eHh4IiwgYWxnb3JpdGhtPSJobWFjLXNoYTI1NiIsIGhlYWRlcnM9Imhvc3QgZGF0ZSByZXF1ZXN0LWxpbmUiLCBzaWduYXR1cmU9Im1WemtxWitBOTVFRlVoOGlCTENzUVI3WDhVKzNwUGc3dVF1amIwZlhYck09Ig==
Authentication URL example (Java)
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.SimpleDateFormat;
import java.util.Base64;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import okhttp3.HttpUrl; // HttpUrl is assumed to come from the OkHttp library

public class IseAuth {
    // hostUrl must use the http or https scheme here: java.net.URL has no handler for ws.
    public static String getAuthUrl(String hostUrl, String apiKey, String apiSecret) throws Exception {
        URL url = new URL(hostUrl);
        // Current time in RFC1123 format, GMT, e.g. "Wed, 10 Jul 2019 07:35:43 GMT"
        SimpleDateFormat format = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        String date = format.format(new Date());
        // signature_origin: "host: $host\ndate: $date\nGET $path HTTP/1.1"
        String signatureOrigin = "host: " + url.getHost() + "\n"
                + "date: " + date + "\n"
                + "GET " + url.getPath() + " HTTP/1.1";
        Charset charset = StandardCharsets.UTF_8;
        // Sign signature_origin with HMAC-SHA256 keyed by apiSecret, then base64-encode the digest
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(apiSecret.getBytes(charset), "HmacSHA256"));
        byte[] digest = mac.doFinal(signatureOrigin.getBytes(charset));
        String signature = Base64.getEncoder().encodeToString(digest);
        // authorization_origin, base64-encoded below to form the authorization parameter
        String authorization = String.format("api_key=\"%s\", algorithm=\"%s\", headers=\"%s\", signature=\"%s\"",
                apiKey, "hmac-sha256", "host date request-line", signature);
        // The URL is built with the https scheme; the caller replaces the scheme
        // with the WebSocket one before connecting.
        HttpUrl httpUrl = HttpUrl.parse("https://" + url.getHost() + url.getPath()).newBuilder()
                .addQueryParameter("authorization", Base64.getEncoder().encodeToString(authorization.getBytes(charset)))
                .addQueryParameter("date", date)
                .addQueryParameter("host", url.getHost())
                .build();
        return httpUrl.toString();
    }
}
Authentication results
If the handshake succeeds, HTTP status code 101 is returned, indicating that the protocol upgrade succeeded. If the handshake fails, an HTTP status code that depends on the error type is returned together with an error description. The details are as follows:
HTTP Code | Description | Error Message | Resolution |
---|---|---|---|
401 | Missing authorization parameter | {"message": "Unauthorized"} | Check whether the authorization parameter is present; see the authorization parameter generation rules |
401 | Signature parameter parsing failed | {"message": "HMAC signature cannot be verified"} | Check whether any signature parameter is missing, and in particular make sure the copied api_key is correct |
401 | Signature verification failed | {"message": "HMAC signature does not match"} | There are many possible causes. 1. Check whether api_key and api_secret are correct. 2. Check whether the host, date, and request-line parameters used to compute the signature are joined as the protocol requires. 3. Check whether the base64 length of the signature is normal (normally 44 bytes). |
403 | Clock offset verification failed | {"message": "HMAC signature cannot be verified,a valid date or x-date header is required for HMAC Authentication"} | Check whether the time on the requesting server is accurate; a difference of more than 5 minutes triggers this error |
403 | IP whitelist validation failed | {"message": "Your IP address is not allowed"} | Disable the IP whitelist in the console, or check whether the IP address set in the whitelist is the external IP address of the local machine |
Handshake failure return example:
HTTP/1.1 401 Forbidden
Date: Thu, 06 Dec 2018 07:55:16 GMT
Content-Length: 116
Content-Type: text/plain; charset=utf-8
{
"message": "HMAC signature does not match"
}
Interface data transmission and reception
After a successful handshake, the client and server establish a WebSocket connection, over which the client can upload and receive data at the same time.
// Connection established; start sending data.
// Size of each audio frame in bytes. Sending 1280 bytes every 40 ms is recommended.
// The size can be adjusted but must not exceed 19200 bytes (no more than 26000 bytes
// after base64 encoding); otherwise error 10163 (data length error) is reported.
int frameSize = 1280;
int interval = 40; // milliseconds between frames
int status = 0;    // status of the audio
try (FileInputStream fs = new FileInputStream(file)) {
    byte[] buffer = new byte[frameSize];
    // Send audio
    // ... (read frames from fs and send them at the given interval)
}
- The server supports websocket-version 13; please make sure the client framework supports this version.
- All frames returned by the server are of type TextMessage, which corresponds to opcode=1 in the native WebSocket protocol frame. Make sure the client parses frames of this type; if it does not, try upgrading the client framework or switching to another framework.
- If frame splitting occurs, that is, one JSON packet is delivered to the client in multiple frames and the client fails to parse the JSON, the cause is most often a problem in the client framework's WebSocket protocol parsing. If this happens, try upgrading the framework version or switching to another framework.
- When the client ends the session and needs to close the connection, it should try to pass WebSocket close code 1000 to the server (if the client framework provides no interface for passing a code on close, this item can be ignored).
- The number of bytes per frame differs between audio formats. For uncompressed PCM, we suggest sending 1280 bytes every 40 ms. The size can be adjusted, but it must not exceed 19200 bytes, that is, no more than 26000 bytes after base64 encoding; otherwise error 10163 (data too long) is reported.
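The frame-size guidance above can be illustrated with a minimal sketch that splits raw audio into base64-encoded frames. The class and method names (AudioFramer, toBase64Frames) are hypothetical; only the 1280-byte recommendation and the 19200-byte limit come from the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

/** Splits raw audio into base64 frames (a sketch, not part of any official SDK). */
final class AudioFramer {
    static final int RECOMMENDED_FRAME_BYTES = 1280; // suggested size per 40 ms for PCM
    static final int MAX_FRAME_BYTES = 19200;        // documented upper bound before base64

    /** Returns the audio as base64 strings, each covering at most frameSize raw bytes. */
    static List<String> toBase64Frames(byte[] audio, int frameSize) {
        if (frameSize <= 0 || frameSize > MAX_FRAME_BYTES) {
            throw new IllegalArgumentException("frameSize must be in 1.." + MAX_FRAME_BYTES);
        }
        List<String> frames = new ArrayList<>();
        for (int offset = 0; offset < audio.length; offset += frameSize) {
            int end = Math.min(offset + frameSize, audio.length);
            byte[] chunk = Arrays.copyOfRange(audio, offset, end);
            frames.add(Base64.getEncoder().encodeToString(chunk));
        }
        return frames;
    }

    public static void main(String[] args) {
        // 3000 bytes of silence stand in for real PCM data.
        List<String> frames = toBase64Frames(new byte[3000], RECOMMENDED_FRAME_BYTES);
        System.out.println(frames.size() + " frames, first has "
                + frames.get(0).length() + " base64 characters");
    }
}
```

A sender would typically pause for the chosen interval (for example with Thread.sleep(40)) between frames. Since base64 turns every 3 bytes into 4 characters, a 19200-byte frame encodes to 25600 characters, which stays under the 26000-byte bound.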
Request Parameters
The request data are all JSON strings.
Parameter name | Type | Mandatory | Description |
---|---|---|---|
common | object | Yes | Common parameters, uploaded only in the first frame after a successful handshake; see below |
business | object | Yes | Business parameters, uploaded in the first frame after a successful handshake and when subsequent data is sent; see below |
data | object | Yes | Business data stream parameters, uploaded in every request after a successful handshake; see below |
Common parameter description (common)
Parameter name | Type | Mandatory | Description |
---|---|---|---|
app_id | string | Yes | APPID applied for on the platform |
Description of business parameters (business)
Parameter Name | Type | Mandatory | Description | Example |
---|---|---|---|---|
sub | string | Yes | Service type; set to ise (open evaluation) | "ise" |
ent | string | Yes | Chinese: cn_vip; English: en_vip | "cn_vip" |
category | string | Yes | Chinese question types: read_syllable (single-character reading, Chinese only), read_word (word reading), read_sentence (sentence reading), read_chapter (passage reading). English question types: read_word (word reading), read_sentence (sentence reading), read_chapter (passage reading), simple_expression (English situational response), read_choice (English multiple choice), topic (English free response), retell (English retelling), picture_talk (English picture description), oral_translation (English oral translation) | "read_sentence" |
aus | int | Yes | Distinguishes the audio state during upload (mandatory in the audio upload stage, when cmd = auw). 1: first audio frame; 2: intermediate audio frame; 4: last audio frame | set according to the upload stage |
cmd | string | Yes | Distinguishes the data upload stage. ssb: parameter upload stage; ttp: text upload stage (skipped when ttp_skip=true, in which case the text in the text field is used directly); auw: audio upload stage | set according to the upload stage |
text | string | Yes | Text to be evaluated, UTF-8 encoded, with a UTF-8 BOM prepended | '\uFEFF'+text |
tte | string | Yes | Encoding of the text to be evaluated: utf-8 or gbk | "utf-8" |
ttp_skip | bool | Yes | Skip the ttp stage and evaluate the text supplied in the ssb stage directly (see the cmd parameter); default true | true |
extra_ability | string | No | Extended abilities (take effect when ise_unite="1" and rst="entirety"). multi_dimension: returns multi-dimensional scores (accuracy, fluency, and completeness); applicable to word and sentence question types. pitch: returns word fundamental-frequency information (start and end values); applicable to word and sentence question types only. syll_phone_err_msg: returns phoneme error information (whether the sounds and tone patterns are correct); applicable to word and sentence question types. To select several abilities, separate them with semicolons, for example add("extra_ability","syll_phone_err_msg;pitch;multi_dimension") | "multi_dimension" |
aue | string | No | Audio format. raw: uncompressed pcm or wav audio (for wav, removing the header is recommended); lame: mp3 audio; speex-wb;7: iFLYTEK-customized speex audio (default) | "raw" |
auf | string | No | Audio sampling rate; default audio/L16;rate=16000 | "audio/L16;rate=16000" |
rstcd | string | No | Encoding of the returned result: utf8 or gbk (default) | "utf8" |
group | string | No | Target group. The same paper and audio are scored differently for different groups (supported only for Chinese character, word, sentence, and chapter question types); this parameter affects the accuracy score. adult: adult group (the default when no group is set); youth: secondary school group; pupil: elementary school group (when set for Chinese sentence and chapter question types, accuracy_score is returned) | "adult" |
check_type | string | No | Strictness threshold for scoring and error detection (Chinese engine only). easy: easy; common: normal; hard: hard | "common" |
grade | string | No | Grade level for the evaluation (Chinese question types only: supported for elementary and middle school sentence and chapter question types). junior: grades 1-2; middle: grades 3-4; senior: grades 5-6 | "middle" |
rst | string | No | Controls the returned result and the scoring scale (also affected by the ise_unite and plev parameters). entirety: complete result (default); for 100-point scores in Chinese or English, the recommended parameters are rst="entirety" and ise_unite="1", used together with the extra_ability parameter. plain: simplified result (only the total score is returned), for example: <?xml version="1.0" ?><FinalResult><ret value="0"/><total_score value="98.507320"/></FinalResult> | "entirety" |
ise_unite | string | No | Controls the returned result. 0: no control (default); 1: control (the extra_ability parameter then affects whether information such as full-dimension scores is returned) | "0" |
plev | string | No | With rst="entirety" (default) and ise_unite="0" (default), the value of plev affects the returned result. 0: return all information (Chinese results include rec_node_type, perr_msg, fluency_score, and phone_score; English results include accuracy_score, serr_msg, syll_accent, fluency_score, standard_score, and pitch) | "0" |
Example of request parameters:
First data transmission:
{
"common": {
"app_id": "xxxxxxx"
},
"business": {
"aue": "raw",
"auf": "audio/L16;rate=16000",
"category": "read_sentence",
"cmd": "ssb",
"ent": "en_vip",
"sub": "ise",
"text": "[content]When you don't know what you're doing, it's helpful to begin by learning about what you should not do. ",
"ttp_skip": true
},
"data": {
"status": 0
}
}
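As a rough illustration of the first frame, the sketch below assembles the same JSON shape by hand and prepends the BOM that the text parameter requires. The class IseFirstFrame and its methods are hypothetical, the fixed values (aue, auf, tte, ttp_skip) are just one possible choice taken from the parameter table, and a real client would more likely use a JSON library.

```java
/** Builds the first (ssb) request frame as a JSON string (a sketch, not an official helper). */
final class IseFirstFrame {
    /** Minimal JSON string escaping: quotes, backslashes, and control characters. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Returns the first frame: common + business (cmd="ssb") + data.status=0. */
    static String build(String appId, String ent, String category, String text) {
        // The text must start with a BOM (U+FEFF); add it only if it is missing.
        String withBom = text.startsWith("\uFEFF") ? text : "\uFEFF" + text;
        return "{\"common\":{\"app_id\":\"" + escape(appId) + "\"},"
                + "\"business\":{\"sub\":\"ise\",\"ent\":\"" + escape(ent) + "\","
                + "\"category\":\"" + escape(category) + "\",\"cmd\":\"ssb\","
                + "\"text\":\"" + escape(withBom) + "\",\"tte\":\"utf-8\",\"ttp_skip\":true,"
                + "\"aue\":\"raw\",\"auf\":\"audio/L16;rate=16000\"},"
                + "\"data\":{\"status\":0}}";
    }

    public static void main(String[] args) {
        // "xxxxxxx" is a placeholder app_id, as in the example above.
        System.out.println(build("xxxxxxx", "en_vip", "read_sentence", "[content]Hello."));
    }
}
```

Adding the BOM inside the builder means callers can pass plain text without remembering the prefix, and text that already carries it is not prefixed twice.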
Request audio data parameters (data)
Parameter Name | Type | Mandatory | Description | Example |
---|---|---|---|---|
data | string | Yes | Audio data, base64 encoded | base64-encoded audio data |
status | int | Yes | Status of the data being sent: 0 for the first transmission, 1 for intermediate data, 2 for the last transmission | set according to the status of the data being sent |
Subsequent data transmission:
{
"business": {
"cmd": "auw",
"aus":1
},
"data": {
"status": 1,
"data":"PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4K"
}
}
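An audio frame like the one above could be assembled as follows. This is a sketch under the assumption that the caller supplies the aus and data.status values for the current stage; the class name IseAudioFrame is hypothetical.

```java
import java.util.Base64;

/** Builds an audio (auw) request frame as a JSON string (a sketch). */
final class IseAudioFrame {
    /**
     * chunk  : raw audio bytes for this frame
     * aus    : 1 = first, 2 = intermediate, 4 = last audio frame
     * status : value for data.status (1 while sending, 2 on the last frame)
     */
    static String build(byte[] chunk, int aus, int status) {
        // Base64 output contains no characters that need JSON escaping.
        String audio = Base64.getEncoder().encodeToString(chunk);
        return "{\"business\":{\"cmd\":\"auw\",\"aus\":" + aus + "},"
                + "\"data\":{\"status\":" + status + ",\"data\":\"" + audio + "\"}}";
    }

    public static void main(String[] args) {
        // Three ASCII bytes stand in for a real audio chunk.
        System.out.println(build(new byte[] {97, 98, 99}, 1, 1));
    }
}
```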
Return parameter description
Parameter name | Type | Description |
---|---|---|
sid | string | ID of the current session; the same sid is returned throughout a session |
code | int | Return code. 0 means the request succeeded; any other code means the request failed, and the client should disconnect immediately to end the session. See Error Code for the list of error codes |
message | string | Description of the error when one occurs |
data | object | Returned data |
data.data | string | Evaluation result: a base64 string that decodes to XML |
data.status | int | Status of the result. status=2 means all results have been returned; the client should treat the result received with status=2 as the final result |
Return example (data.data is shown decoded to XML for readability):
{
"code": 0,
"message": "success",
"sid": "isexxxxxxxxxxxxxxxxxxxxxxxxx",
"data": {
"status": 2,
"data": "<?xml version="1.0" encoding="UTF-8"?>
  <xml_result>
      <read_sentence lan="cn" type="study" version="7,0,0,1024">
          <rec_paper>
              <read_sentence accuracy_score="100.000000" beg_pos="0" content="今天天气怎么样。" emotion_score="87.315361" end_pos="150" except_info="0" fluency_score="87.620300" integrity_score="100.000000" is_rejected="false" phone_score="100.000000" time_len="150" tone_score="100.000000" total_score="92.511200">
                  <sentence beg_pos="0" content="今天天气怎么样" end_pos="150" fluency_score="0.000000" phone_score="100.000000" time_len="150" tone_score="100.000000" total_score="86.959984">
                      <word beg_pos="0" content="今" end_pos="22" symbol="jin1" time_len="22">
                          <syll beg_pos="0" content="fil" dp_message="32" end_pos="1" rec_node_type="fil" time_len="1">
                              <phone beg_pos="0" content="fil" dp_message="32" end_pos="1" rec_node_type="fil" time_len="1"></phone>
                          </syll>
                          <syll beg_pos="1" content="今" dp_message="0" end_pos="22" rec_node_type="paper" symbol="jin1" time_len="21">
                              <phone beg_pos="1" content="j" dp_message="0" end_pos="4" is_yun="0" perr_level_msg="2" perr_msg="0" rec_node_type="paper" time_len="3"></phone>
                              <phone beg_pos="4" content="in" dp_message="0" end_pos="22" is_yun="1" mono_tone="TONE1" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="18"></phone>
                          </syll>
                      </word>
                      <word beg_pos="22" content="天" end_pos="40" symbol="tian1" time_len="18">
                          <syll beg_pos="22" content="天" dp_message="0" end_pos="40" rec_node_type="paper" symbol="tian1" time_len="18">
                              <phone beg_pos="22" content="t" dp_message="0" end_pos="30" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="8"></phone>
                              <phone beg_pos="30" content="ian" dp_message="0" end_pos="40" is_yun="1" mono_tone="TONE1" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="10"></phone>
                          </syll>
                      </word>
                      <word beg_pos="40" content="天" end_pos="58" symbol="tian1" time_len="18">
                          <syll beg_pos="40" content="天" dp_message="0" end_pos="58" rec_node_type="paper" symbol="tian1" time_len="18">
                              <phone beg_pos="40" content="t" dp_message="0" end_pos="46" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="6"></phone>
                              <phone beg_pos="46" content="ian" dp_message="0" end_pos="58" is_yun="1" mono_tone="TONE1" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="12"></phone>
                          </syll>
                      </word>
                      <word beg_pos="58" content="气" end_pos="74" symbol="qi9" time_len="16">
                          <syll beg_pos="58" content="气" dp_message="0" end_pos="74" rec_node_type="paper" symbol="qi0" time_len="16">
                              <phone beg_pos="58" content="q" dp_message="0" end_pos="66" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="8"></phone>
                              <phone beg_pos="66" content="i" dp_message="0" end_pos="74" is_yun="1" mono_tone="TONE0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="8"></phone>
                          </syll>
                      </word>
                      <word beg_pos="74" content="怎" end_pos="84" symbol="zen3" time_len="10">
                          <syll beg_pos="74" content="怎" dp_message="0" end_pos="84" rec_node_type="paper" symbol="zen3" time_len="10">
                              <phone beg_pos="74" content="z" dp_message="0" end_pos="79" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="5"></phone>
                              <phone beg_pos="79" content="en" dp_message="0" end_pos="84" is_yun="1" mono_tone="TONE3" perr_level_msg="2" perr_msg="0" rec_node_type="paper" time_len="5"></phone>
                          </syll>
                      </word>
                      <word beg_pos="84" content="么" end_pos="93" symbol="me5" time_len="9">
                          <syll beg_pos="84" content="么" dp_message="0" end_pos="93" rec_node_type="paper" symbol="me0" time_len="9">
                              <phone beg_pos="84" content="m" dp_message="0" end_pos="88" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="4"></phone>
                              <phone beg_pos="88" content="e" dp_message="0" end_pos="93" is_yun="1" mono_tone="TONE0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="5"></phone>
                          </syll>
                      </word>
                      <word beg_pos="93" content="样" end_pos="150" symbol="yang4" time_len="57">
                          <syll beg_pos="93" content="样" dp_message="0" end_pos="112" rec_node_type="paper" symbol="yang4" time_len="19">
                              <phone beg_pos="93" content="_i" dp_message="0" end_pos="96" is_yun="0" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="3"></phone>
                              <phone beg_pos="96" content="iang" dp_message="0" end_pos="112" is_yun="1" mono_tone="TONE4" perr_level_msg="1" perr_msg="0" rec_node_type="paper" time_len="16"></phone>
                          </syll>
                          <syll beg_pos="112" content="sil" dp_message="0" end_pos="150" rec_node_type="sil" time_len="38">
                              <phone beg_pos="112" content="sil" end_pos="150" time_len="38"></phone>
                          </syll>
                      </word>
                  </sentence>
              </read_sentence>
          </rec_paper>
      </read_sentence>
  </xml_result>"
}
}
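Since data.data arrives as a base64 string, a client has to decode it before reading scores from the XML. The following is a minimal sketch using only the JDK; the class IseResultReader and its method are hypothetical, and it assumes the XML carries its own encoding declaration, as in the example above.

```java
import java.io.ByteArrayInputStream;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Reads total_score out of a base64-encoded evaluation result (a sketch). */
final class IseResultReader {
    /**
     * Decodes base64Xml and returns the total_score attribute of the first
     * element named nodeName that carries one (for example "read_sentence").
     */
    static double totalScore(String base64Xml, String nodeName) {
        try {
            byte[] xml = Base64.getDecoder().decode(base64Xml);
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations; the results shown above do not use them.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
            NodeList nodes = doc.getElementsByTagName(nodeName);
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                if (element.hasAttribute("total_score")) {
                    return Double.parseDouble(element.getAttribute("total_score"));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not read evaluation result", e);
        }
        throw new IllegalStateException("no total_score attribute on <" + nodeName + ">");
    }
}
```

The loop skips elements without a total_score attribute because, in the example above, the outer read_sentence element only describes the paper while the nested one holds the scores.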
Chinese evaluation return parameter description
Question type | Node | Field information |
---|---|---|
Character and word question types (elementary, adult) | read_syllable or read_word | phone_score: pronunciation score; tone_score: tone score; total_score: total score, computed as (phone_score + tone_score) / 2 |
Character and word question types (elementary, adult) | sentence | No important information |
Character and word question types (elementary, adult) | word | No important information |
Character and word question types (elementary, adult) | syll | dp_message: 0 normal; 16 missed; 32 added; 64 repeated; 128 replaced |
Character and word question types (elementary, adult) | phone | dp_message: 0 normal; 16 missed; 32 added; 64 repeated; 128 replaced (when dp_message is not 0, perr_msg may take the same value as dp_message); mono_tone: tone type; perr_level_msg: confidence level of the error detection result (values 1, 2, 3; 1 is best, 3 is worst; a value of 0 can be ignored); is_yun: 0 initial, 1 final; when is_yun=0, perr_msg has two states: 0 initial correct, 1 initial incorrect; when is_yun=1, perr_msg has four states: 0 final and tone both correct, 1 final incorrect, 2 tone incorrect, 3 final and tone both incorrect |
Sentence and chapter question types (elementary) | read_sentence or read_chapter | accuracy_score: accuracy score; emotion_score: overall impression score (whether the reading is clear, fluent, expressive, and so on); fluency_score: fluency score; integrity_score: completeness score; phone_score: pronunciation score; tone_score: tone score; total_score: total score, computed as accuracy_score*0.4 + fluency_score*0.4 + emotion_score*0.2 |
Sentence and chapter question types (elementary) | sentence | phone_score: pronunciation score; tone_score: tone score; total_score: total score [model regression] |
Sentence and chapter question types (elementary) | word | No important information |
Sentence and chapter question types (elementary) | syll | dp_message: 0 normal; 16 missed; 32 added; 64 repeated; 128 replaced |
Sentence and chapter question types (elementary) | phone | dp_message: 0 normal; 16 missed; 32 added; 64 repeated; 128 replaced (when dp_message is not 0, perr_msg may take the same value as dp_message); mono_tone: tone type; perr_level_msg: confidence level of the error detection result (values 1, 2, 3; 1 is best, 3 is worst; a value of 0 can be ignored); is_yun: 0 initial, 1 final; when is_yun=0, perr_msg has two states: 0 initial correct, 1 initial incorrect; when is_yun=1, perr_msg has four states: 0 final and tone both correct, 1 final incorrect, 2 tone incorrect, 3 final and tone both incorrect |
Sentence and chapter question types (adult) | read_sentence or read_chapter | fluency_score: fluency score; integrity_score: completeness score; phone_score: pronunciation score; tone_score: tone score; total_score: total score [model regression] |
Sentence and chapter question types (adult) | sentence | phone_score: pronunciation score; tone_score: tone score; total_score: total score [model regression] |
Sentence and chapter question types (adult) | word | No important information |
Sentence and chapter question types (adult) | syll | dp_message: 0 normal; 16 missed; 32 added; 64 repeated; 128 replaced |
Sentence and chapter question types (adult) | phone | dp_message: 0 normal; 16 missed; 32 added; 64 repeated; 128 replaced (when dp_message is not 0, perr_msg may take the same value as dp_message); mono_tone: tone type; perr_level_msg: confidence level of the error detection result (values 1, 2, 3; 1 is best, 3 is worst; a value of 0 can be ignored); is_yun: 0 initial, 1 final; when is_yun=0, perr_msg has two states: 0 initial correct, 1 initial incorrect; when is_yun=1, perr_msg has four states: 0 final and tone both correct, 1 final incorrect, 2 tone incorrect, 3 final and tone both incorrect |
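The codes in the table can be turned into readable labels with a small helper. The sketch below is illustrative only: the class ChinesePhoneResult and its label strings are hypothetical, the weights 0.4, 0.4, and 0.2 are read from the elementary sentence and chapter row, and emotion_score is taken to be the overall impression score.

```java
/** Interprets phone-level codes from a Chinese evaluation result (a sketch). */
final class ChinesePhoneResult {
    /** Maps dp_message to a label: 0 normal, 16 missed, 32 added, 64 repeated, 128 replaced. */
    static String describeDpMessage(int dpMessage) {
        switch (dpMessage) {
            case 0:   return "normal";
            case 16:  return "missed";
            case 32:  return "added";
            case 64:  return "repeated";
            case 128: return "replaced";
            default:  return "unknown(" + dpMessage + ")";
        }
    }

    /** Maps perr_msg to a label; isYun is 0 for an initial and 1 for a final. */
    static String describePerrMsg(int isYun, int perrMsg) {
        if (isYun == 0) {
            if (perrMsg == 0) return "initial correct";
            if (perrMsg == 1) return "initial incorrect";
        } else if (isYun == 1) {
            switch (perrMsg) {
                case 0: return "final and tone correct";
                case 1: return "final incorrect";
                case 2: return "tone incorrect";
                case 3: return "final and tone incorrect";
                default: break;
            }
        }
        // perr_msg may mirror a non-zero dp_message, so other values can occur.
        return "unknown(" + perrMsg + ")";
    }

    /** Weighted total for elementary sentence and chapter question types. */
    static double sentenceTotal(double accuracy, double fluency, double emotion) {
        return accuracy * 0.4 + fluency * 0.4 + emotion * 0.2;
    }

    public static void main(String[] args) {
        System.out.println(describeDpMessage(32) + ", " + describePerrMsg(1, 2));
        System.out.println(sentenceTotal(100.0, 87.620300, 87.315361));
    }
}
```

With the scores from the return example earlier in this document (accuracy 100, fluency 87.6203, emotion 87.315361), the weighted sum comes to about 92.5112, which agrees with the total_score shown there.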
English Review Back to Parameter Description
Question Type | Node | Field Information |
---|---|---|
word question type (adult) | read_word | [adult word] total_score: total score[model regression] |
Word Problems (Adults) | sentence | no important information |
word question type (adult) | word | dp_message: 0 normal; 16 missed; 32 incremental; 64 readback; 128 replacement; total_score: score for each word |
word type (adult) | syll | syll_score: score for each syllable serr_msg: syllable error detection [1 or 2049, it means the reading is wrong; when serr_msg=2049, it means both the syllable and the stress are wrong] syll_accent: rereading error detection [0, it means the syllable does not need to be reread; 1, it means the syllable needs to be reread, and the engine will not detect it; if it is 2048 or 2049, it means the reading is wrong, and the engine will parse serr_msg again. If it is 0, the syllable does not need to be reread, and the engine does not detect it; if it is 1, the syllable needs to be reread, and the serr_msg is parsed, if it is 2048 or 2049, it means the syllable is wrongly reread, and the engine is optimizing the effect, so we can not pay attention to this case]. |
word question type (adult) | phone | dp_message: 0 normal; 16 missed; 32 incremental; 64 readback; 128 replacement; |
Sentence and Chapter Questions (Adult) | read_sentence or read_chapter | accuracy_score: accuracy score; standard_score: standard score; fluency_score: fluency score; integrity_score: integrity score. [Adult Sentence] total_score = (0.6*accuracy_score + 0.3*fluency_score + 0.1*standard_score) * integrity_score/100. [Adult Chapter] total_score = (0.5*accuracy_score + 0.3*fluency_score + 0.2*standard_score) * integrity_score/100 |
Sentence and Chapter Questions (Adult) | sentence | accuracy_score: accuracy score; standard_score: standard score; fluency_score: fluency score. [Adult Sentence] total_score = (0.6*accuracy_score + 0.3*fluency_score + 0.1*standard_score). [Adult Chapter] total_score = (0.5*accuracy_score + 0.3*fluency_score + 0.2*standard_score) |
Sentence and Chapter Questions (Adult) | word | dp_message: 0 normal; 16 omitted; 32 inserted; 64 repeated; 128 substituted; total_score: score for each word. Error checks for pauses, linking, stress, and sentence-final rising/falling intonation (all four steps are still being optimized and can be ignored for now): 1. Bitwise-AND the property value in the xml with the Property value in the table above. 2. If the result equals the Property value in the table, this type of check was performed here; otherwise no check was performed here. 3. Check whether werr_msg appears on the word node in the xml; if it does not, the word was read correctly. 4. If it appears, bitwise-AND the werr_msg value in the xml with the corresponding Werr_msg value in the table; if the result still equals the value for this type, this type was read incorrectly. |
Sentence and Chapter Questions (Adult) | syll | syll_score: score for each syllable; serr_msg: syllable error detection [1 or 2049 means the syllable was read incorrectly; this check is still being optimized and can be ignored for now] |
Sentence and Chapter Questions (Adult) | phone | dp_message: 0 normal; 16 omitted; 32 inserted; 64 repeated; 128 substituted |
Scenario Response | rec_paper | total_score: total score [model regression] |
story retelling-topic | rec_paper | total_score: total score [model regression] |
Retelling questions, oral translation, to point questions, looking at pictures | rec_paper | accuracy_score: accuracy score standard_score:standard_score fluency_score: fluency_score integrity_score :integrity_score total_score: total_score [model regression] |
oral essay | rec_paper | total_score: total score [model regression] |
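The score formulas and status codes in the table above can be reproduced on the client side. The following is a minimal Python sketch, not part of any official SDK: the function names and the English labels are illustrative, and it assumes the fields have already been read from the result XML.

```python
# Illustrative helpers; all names here are hypothetical.
# Weights and codes are copied from the result table above.
DP_MESSAGE = {
    0: "normal",
    16: "omitted",
    32: "inserted",
    64: "repeated",
    128: "substituted",
}


def adult_total_score(kind, accuracy, fluency, standard, integrity=None):
    """Combine the sub-scores for an adult 'sentence' or 'chapter' question.

    Pass integrity for the top-level read_sentence/read_chapter node;
    leave it out for the inner sentence node, which has no integrity term.
    """
    if kind == "sentence":
        score = 0.6 * accuracy + 0.3 * fluency + 0.1 * standard
    elif kind == "chapter":
        score = 0.5 * accuracy + 0.3 * fluency + 0.2 * standard
    else:
        raise ValueError(f"unknown question kind: {kind!r}")
    if integrity is not None:
        score = score * integrity / 100
    return score


def describe_dp_message(value):
    """Map a dp_message attribute (string or int) to a readable label."""
    return DP_MESSAGE.get(int(value), "unknown")


def flag_is_set(value, flag):
    """Bitwise-AND test as described for werr_msg: all bits of flag are set."""
    return (int(value) & flag) == flag
```

For example, `adult_total_score("sentence", 80, 90, 70, integrity=100)` combines the three sub-scores with the sentence weights and then applies the integrity factor.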
Explanation of the format of the test questions
Explanation of the Chinese test question format
Chinese Characters (read_syllable)
Plain text example:
(1) Without any header and without any node names
(2) Contents that can be included in the test paper: simplified Chinese characters,
traditional Chinese characters (within the range of gbk), 0-9 Arabic numerals (not
recommended), and separators.
(3) The separator is used between two words, and no characters other than Chinese
characters and spaces should appear at the beginning and end of the line.
(4) The content of the test paper may contain the Arabic numerals 0-9, but a test
paper consisting entirely of Arabic numerals is not supported. Numeric values and
digit strings of more than two digits (e.g. years, phone numbers, times) must be
written in Chinese numerals.
(5) The number of Chinese characters in a single line should not exceed 100.
Fung, Ching, Gov.
Example of pinyin labeling:
(1) Words are separated from each other using line breaks.
(2) ü is written as v only in lü and nü (lv and nv, e.g., female: nv3); in all other
syllables it is written as u, e.g., bureau (ju2). üe is written as ue, e.g., slightly (lue4).
(3) Pinyin should be the correct dictionary pinyin, and the tone digit must be in the
range 0-9, where 0, 5, 6, 7, 8, and 9 all denote the neutral tone.
(4) Arabic numerals should not appear in the Chinese character part.
(5) Labeled pinyin must be given for every word in a paper with pinyin.
<customizer: interphonic>
好
hao3
呈
cheng2
Note: The total number of Chinese characters in the test paper text must be in the
range (0,200] and the total number of characters in (0,5000]; the recommended
ranges are (0,100] Chinese characters and (0,200] characters.
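The pinyin labeling rules above lend themselves to a small helper. The sketch below is a hedged illustration, not an official tool: `to_paper_pinyin` and `build_pinyin_paper` are hypothetical names, and the label check (lowercase letters followed by one tone digit) is an assumption derived from the rules and examples in this section.

```python
import re

# Tone digits that all denote the neutral tone, per rule (3).
NEUTRAL_TONES = {0, 5, 6, 7, 8, 9}

# Assumed label shape: lowercase letters followed by one tone digit.
LABEL = re.compile(r"[a-z]+[0-9]")


def to_paper_pinyin(syllable, tone):
    """Convert a syllable such as 'nü' plus a tone digit to paper notation."""
    if not isinstance(tone, int) or not 0 <= tone <= 9:
        raise ValueError("tone must be an integer from 0 to 9")
    s = syllable.lower().replace("üe", "ue")  # üe is written ue
    if s in ("lü", "nü"):
        s = s.replace("ü", "v")               # lü -> lv, nü -> nv
    else:
        s = s.replace("ü", "u")               # every other ü -> u
    return f"{s}{tone}"


def build_pinyin_paper(entries):
    """entries: list of (chinese_text, [label, ...]) pairs.

    Emits the header line, then each text line followed by its labels
    joined with '|', mirroring the examples in this section.
    """
    lines = ["<customizer: interphonic>"]
    for text, labels in entries:
        for label in labels:
            if not LABEL.fullmatch(label):
                raise ValueError(f"malformed pinyin label: {label!r}")
        lines.append(text)
        lines.append("|".join(labels))
    return "\n".join(lines)
```

A call such as `build_pinyin_paper([("好", ["hao3"])])` yields the header line followed by the character and its label on separate lines.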
Chinese words (read_word)
Plain text example:
(1) Contents that can be included in the test paper: simplified Chinese characters,
traditional Chinese characters (within the range of gbk), 0-9 Arabic numerals (not
recommended), and separators.
(2) The separator is used between two words, and no characters other than Chinese
characters and spaces should appear at the beginning and end of the line.
(3) The content of the test paper may contain the Arabic numerals 0-9, but a test
paper consisting entirely of Arabic numerals is not supported. Numeric values and
digit strings of more than two digits (e.g., years, phone numbers, times) must be
written in Chinese numerals.
(4) The number of Chinese characters in a single line should not exceed 100.
Rather, not difficult.
Example of pinyin labeling:
(1) Words are separated from each other using line breaks.
(2) What can be included in a test paper: simplified Chinese characters, pinyin, and
pinyin separator (|).
(3) Pinyin should be the correct dictionary pinyin, and the tone digit must be in the
range 0-9, where 0, 5, 6, 7, 8, and 9 all denote the neutral tone.
(4) Use the "|" symbol to separate the pinyin of the characters within a single word.
(5) Arabic numerals should not appear in the Chinese character part.
(6) Every word in a paper with pinyin must be given a labeled pinyin.
<customizer: interphonic>
宁可
ning4|ke3
非难
fei1|nan4
Note: The total number of Chinese characters in the test paper text must be in the
range (0,200] and the total number of characters in (0,5000]; the recommended
ranges are (0,100] Chinese characters and (0,200] characters.
Chinese Sentence (read_sentence)
Plain text example:
(1) Contents that can be included in the test paper: simplified Chinese characters,
traditional Chinese characters (within the range of gbk), 0-9 Arabic numerals (not
recommended), and separators.
(2) The content of the test paper may contain the Arabic numerals 0-9, but a test
paper consisting entirely of Arabic numerals is not supported. Numeric values and
digit strings of more than two digits (e.g., years, phone numbers, times) must be
written in Chinese numerals.
(3) The number of Chinese characters in a sentence should not exceed 100.
This is an example of a Chinese statement review.
Example of pinyin labeling:
(1) Sentences are separated from each other using line breaks.
(2) What can be included in a test paper: simplified Chinese characters, pinyin, and
pinyin separator (|).
(3) Arabic numerals, English words and letters should not appear in the question
paper.
(4) Pinyin should be the correct dictionary pinyin, and the tone digit must be in the
range 0-9, where 0, 5, 6, 7, 8, and 9 all denote the neutral tone.
(5) Use the "|" symbol to separate the pinyin syllables within a sentence.
(6) The number of Chinese characters in a single line should not exceed 100.
(7) Every word in a paper with pinyin must be given a labeled pinyin.
<customizer: interphonic>
今天天气怎么样
jin1|tian1|tian1|qi4|zen3|me5|yang4
Note: The total number of Chinese characters in the text must be in the range (0,1000] and the total number of characters in (0,10000]; the recommended ranges are [5,500] Chinese characters and (0,1000] characters.
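The size limits stated in the notes for the Chinese question types can be checked before a paper is uploaded. This is a rough sketch under stated assumptions: the function name is hypothetical, Chinese characters are approximated by the CJK Unified Ideographs block rather than the full GBK range, and every character of the text (including line breaks and labels) is counted toward the total, which the document does not specify.

```python
import re

# Approximation: CJK Unified Ideographs only (the document refers to GBK).
HAN = re.compile(r"[\u4e00-\u9fff]")

# (max Chinese characters, max total characters), from the notes above.
LIMITS = {
    "read_syllable": (200, 5000),
    "read_word": (200, 5000),
    "read_sentence": (1000, 10000),
}


def paper_size_ok(category, text):
    """Return True when text is within the hard limits for the category."""
    max_han, max_chars = LIMITS[category]
    han_count = len(HAN.findall(text))
    return 0 < han_count <= max_han and 0 < len(text) <= max_chars
```

No limits are listed for read_chapter in this section, so it is deliberately left out of `LIMITS`.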
Chinese Chapter (read_chapter)
Plain text example (same as the Sentence paper, except that the chapter is made up
of multiple sentences, see Sentence paper instructions for notes):
This is an example of a Chinese statement review.
Example of pinyin labeling:
<customizer: interphonic>
今天天气怎么样
jin1|tian1|tian1|qi4|zen3|me5|yang4
Instructions for English test question format
English Words (read_word)
Ordinary text:
(1) Necessary node: [word], note the use of line breaks for separation.
(2) The number of words should not exceed 100.
(3) Words may be separated only by tabs, line breaks, or spaces.
(4) Symbols allowed inside a word: the English half-width characters . - ' (i.e.
dot, hyphen, apostrophe). For example, p.m and year-old are supported;
hello,world is not supported.
(5) Unsupported punctuation for words: question marks, exclamation points,
semicolons, colons, commas, and illegal characters ( ) [ .
(6) Do not write punctuation as a separate word in the paper (i.e., punctuation with
spaces at both ends); the labeling will report an error.
[word]
apple
banana
Labeling the reading of numbers:
(1) The [number_replace] node must appear on the line after the number.
(2) The line after [number_replace] uses the format "number/reading/". There must
be exactly two "/" symbols, and the content between them must not contain symbols.
[word]
13
[number_replace]
13/thirteen/
Note: Do not put any characters unrelated to the words themselves in the [word]
node; they degrade the evaluation result.
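A [word] paper with optional number readings can be assembled programmatically. The sketch below is illustrative only: `build_word_paper` is a hypothetical helper, it assumes "symbols" in a reading means anything other than letters and spaces, and it places [number_replace] after the last word, which matches the example above where the number is the final entry.

```python
import re

# Letters, digits, and the three allowed in-word symbols: . - '
WORD = re.compile(r"[A-Za-z0-9.'-]+")
# Assumption: a reading consists of letters and spaces only.
READING = re.compile(r"[A-Za-z ]+")


def build_word_paper(words, number_readings=None):
    """Build the text of an English read_word paper.

    words: list of words (at most 100).
    number_readings: optional dict mapping a number to its reading.
    """
    if not 0 < len(words) <= 100:
        raise ValueError("a [word] paper needs between 1 and 100 words")
    for word in words:
        if not WORD.fullmatch(word):
            raise ValueError(f"unsupported character in word: {word!r}")
    lines = ["[word]", *words]
    if number_readings:
        lines.append("[number_replace]")
        for number, reading in number_readings.items():
            if not number.isdigit():
                raise ValueError(f"not a number: {number!r}")
            if not reading.strip() or not READING.fullmatch(reading):
                raise ValueError(f"invalid reading: {reading!r}")
            # Exactly two '/' characters per entry.
            lines.append(f"{number}/{reading}/")
    return "\n".join(lines)
```

Calling `build_word_paper(["13"], {"13": "thirteen"})` reproduces the four-line example shown above.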
English user-defined phonetic symbols:
Users can add their own defined phonetic symbols to this node, and the engine will
evaluate the word according to the phonetic symbols added by the user, regardless
of what the word is really pronounced. It should be noted that when adding
customized phonetic symbols you need to make sure that they are correct IFLYTEK
phonetic symbols, not arbitrary ones; and it is not recommended to customize the
phonetic symbols of numbers under this node.
(1) An error is reported if the entry for a word does not contain exactly two "/"
symbols;
(2) An error is reported if the phonetic transcription of a word is empty (//);
(3) An error is reported if a single phonetic transcription exceeds 128*6 bytes;
(4) Multiple pronunciations may be separated by the vertical line "|";
(5) At present this node has no symbol error detection, so symbols can appear
between the two "/", but symbols other than vertical lines and apostrophes are
not recommended;
[word]
lose
[vocabulary]
lose/l uw z/
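The [vocabulary] rules above can be pre-checked before submission. The following is a minimal sketch with a hypothetical function name; it assumes the 128*6-byte limit applies to each pronunciation separately and measures bytes in UTF-8, neither of which the document states explicitly.

```python
MAX_PHONETIC_BYTES = 128 * 6


def check_vocabulary_entry(entry):
    """Validate one [vocabulary] line such as 'lose/l uw z/'.

    Returns (word, [pronunciation, ...]) or raises ValueError.
    """
    if entry.count("/") != 2:
        raise ValueError("entry must contain exactly two '/' symbols")
    word, phonetics, tail = entry.split("/")
    if not word.strip() or tail != "":
        raise ValueError("expected the form word/phonetics/")
    if not phonetics.strip():
        raise ValueError("phonetic transcription is empty (//)")
    pronunciations = [p.strip() for p in phonetics.split("|")]
    for pron in pronunciations:
        if not pron:
            raise ValueError("empty pronunciation between '|' separators")
        if len(pron.encode("utf-8")) > MAX_PHONETIC_BYTES:
            raise ValueError("pronunciation exceeds 128*6 bytes")
    return word, pronunciations
```

The check covers only the structural rules; it does not verify that the symbols are valid IFLYTEK phonetic symbols.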
English Sentence (read_sentence)
Ordinary text:
(1) Necessary node: [content], note the use of line breaks for separation.
(2) The following four English half-width characters may be used to split the
content into clauses: . ! ? ;
(3) The three characters ( ) [ must not appear at the beginning or in the middle of
the text.
(4) The character [ must not appear at the end of the text; a single ( or ) may appear
there, but not more than one.
(5) Full-width characters are supported (a full-width character occupies two bytes;
the engine first converts full-width to half-width), but they must not account for
more than 10% of the byte size of the entire content node.
(6) Unsupported characters must not account for more than 10% of the bytes of the
entire content node. Common unsupported characters include @ , # , $ , % , & , + , { , }.
(7) The number of words per sentence cannot exceed 100, and the number of bytes
per sentence cannot exceed 1024 (each clause separator also counts as one
byte).
(8) The total number of words must not exceed 1000.
[content]
This is an example of sentence test.
Example with supported English half-width characters:
[content]
I don't know.
Labeling the reading of numbers:
(1) An error is reported if an entry does not contain exactly two "/" symbols.
(2) Multiple readings of a number are separated by the vertical line "|".
(3) The content must be in lowercase letters.
(4) The maximum replacement number length must not exceed 31.
[content]
I'm 13 years old.
[number_replace]
13/thirteen/
Note: Unless there is a special need, do not add anything to the [content] text that
is unrelated to the content of the paper, and do not alter the words (such as long
to l-o-n-g); both affect the grading.
Description of non-essential nodes of sentence questions:
(1) Regarding [number_replace], an error is reported if an entry does not contain
exactly two "/" symbols.
(2) Regarding [number_replace], an error is reported if the replacement content is
empty (//).
(3) Regarding [number_replace], multiple readings of a number are separated by
the vertical line "|".
(4) Regarding [number_replace], the content must be in lowercase letters.
(5) Regarding [number_replace], the maximum replacement number length must not exceed 31.
(6) Regarding [vocabulary], an error is reported if an entry does not contain exactly
two "/" symbols.
(7) Regarding [vocabulary], an error is reported if the phonetic transcription of a
word is empty (//).
(8) Regarding [vocabulary], an error is reported if a single phonetic transcription
exceeds 128*6 bytes.
(9) Regarding [vocabulary], multiple pronunciations can be separated by vertical lines.
English user-defined phonetic symbols:
(1) An error is reported if the entry for a word does not contain exactly two "/"
symbols;
(2) An error is reported if the phonetic transcription of a word is empty (//);
(3) An error is reported if a single phonetic transcription exceeds 128*6 bytes;
(4) Multiple pronunciations may be separated by vertical lines;
(5) It is recommended that the content between the two "/" contain no symbols;
[content]
I lose my pencil today.
[vocabulary]
lose/l uw z/
Labeling must use IFLYTEK phonetic symbols; see the cross-reference table below:
IFLYTEK Phonetic | Standard Phonetic | IFLYTEK Phonetic | Standard Phonetic |
---|---|---|---|
aa | ɑː | f | f |
ae | æ | g | g |
ah | ʌ | hh | h |
ao | ɔː | jh | dʒ |
ar | eə | k | k |
aw | aʊ | l | l |
ax | ə | m | m |
ay | aɪ | n | n |
eh | e | ng | ŋ |
er | ɜː | p | p |
ey | eɪ | r | r |
ih | ɪ | s | s |
ir | ɪə | sh | ʃ |
iy | iː | t | t |
oo | ɒ | th | θ |
ow | əʊ | v | v |
oy | ɒɪ | w | w |
uh | ʊ | y | j |
uw | uː | z | z |
ur | ʊə | zh | ʒ |
b | b | dr | dr |
ch | tʃ | dz | dz |
d | d | tr | tr |
dh | ð | ts | ts |
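The cross-reference table can be turned into a lookup so that a transcription can be displayed in standard phonetic notation. This is a hedged sketch: the dictionary copies the table above entry by entry, while the function name and the space-separated input format (as in `l uw z`) are assumptions based on the examples in this document.

```python
# Copied from the cross-reference table above (48 symbols).
TO_STANDARD = {
    "aa": "ɑː", "ae": "æ", "ah": "ʌ", "ao": "ɔː", "ar": "eə", "aw": "aʊ",
    "ax": "ə", "ay": "aɪ", "eh": "e", "er": "ɜː", "ey": "eɪ", "ih": "ɪ",
    "ir": "ɪə", "iy": "iː", "oo": "ɒ", "ow": "əʊ", "oy": "ɒɪ", "uh": "ʊ",
    "uw": "uː", "ur": "ʊə", "b": "b", "ch": "tʃ", "d": "d", "dh": "ð",
    "f": "f", "g": "g", "hh": "h", "jh": "dʒ", "k": "k", "l": "l",
    "m": "m", "n": "n", "ng": "ŋ", "p": "p", "r": "r", "s": "s",
    "sh": "ʃ", "t": "t", "th": "θ", "v": "v", "w": "w", "y": "j",
    "z": "z", "zh": "ʒ", "dr": "dr", "dz": "dz", "tr": "tr", "ts": "ts",
}


def to_standard_phonetic(transcription):
    """Convert a space-separated transcription, e.g. 'l uw z', symbol by symbol."""
    result = []
    for symbol in transcription.split():
        if symbol not in TO_STANDARD:
            raise ValueError(f"symbol not in the table: {symbol!r}")
        result.append(TO_STANDARD[symbol])
    return "".join(result)
```

The same dictionary can serve as a whitelist when checking user-defined [vocabulary] entries, since any symbol missing from it is not in the table.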
English Chapter (read_chapter)
Example of a test paper:
(1) Necessary node: [content], note the use of line breaks for separation.
(2) The following four English half-width characters may be used to split the
content into clauses: . ! ? ;
(3) The three characters ( ) [ must not appear at the beginning or in the middle of
the text.
(4) The character [ must not appear at the end of the text; a single ( or ) may appear
there, but not more than one.
(5) Full-width characters are supported (a full-width character occupies two bytes;
the engine first converts full-width to half-width), but they must not account for
more than 10% of the byte size of the entire content node.
(6) Unsupported characters must not account for more than 10% of the bytes of the
entire content node. Common unsupported characters include @ , # , $ , % , & , + , { , }.
(7) The number of words per sentence cannot exceed 100, and the number of bytes
per sentence cannot exceed 1024 (each clause separator also counts as one
byte).
(8) The total number of words must not exceed 1000.
(9) Do not add meaningless combinations of characters in the text, such as numbers,
various combinations of letters and symbols, e.g. 7FH34J.
[content]
Hello,everybody.This is an example of chapter test.
Note: Unless there is a special need, do not add anything to the [content] text that
is unrelated to the content of the paper, and do not alter the words (such as long
to l-o-n-g); both affect the grading.
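The numeric limits for the [content] node (words and bytes per sentence, total words, share of unsupported characters) can be approximated with a pre-flight check. The sketch below is an assumption-laden illustration, not the engine's own logic: the function name is hypothetical, bytes are counted in UTF-8, sentences are split naively on . ! ? ; (so an abbreviation such as p.m is split too), and only the unsupported characters explicitly listed above are counted.

```python
import re

# The "common unsupported characters" listed above; the real set may be larger.
UNSUPPORTED = set("@#$%&+{}")


def check_english_content(text):
    """Return a list of problems found in a [content] text (empty if none)."""
    total_bytes = len(text.encode("utf-8"))
    if total_bytes == 0:
        return ["content is empty"]
    problems = []
    bad_bytes = sum(len(ch.encode("utf-8")) for ch in text if ch in UNSUPPORTED)
    if bad_bytes / total_bytes > 0.10:
        problems.append("unsupported characters exceed 10% of the content bytes")
    # Naive split: a run of text plus its closing clause separator, if any.
    sentences = [s for s in re.findall(r"[^.!?;]+[.!?;]?", text) if s.strip()]
    total_words = 0
    for index, sentence in enumerate(sentences, start=1):
        words = sentence.split()
        total_words += len(words)
        if len(words) > 100:
            problems.append(f"sentence {index} has more than 100 words")
        if len(sentence.encode("utf-8")) > 1024:
            problems.append(f"sentence {index} is longer than 1024 bytes")
    if total_words > 1000:
        problems.append("content has more than 1000 words")
    return problems
```

An empty result only means that none of these approximate checks fired; the service remains the authority on whether a paper is accepted.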
English Situational Response (simple_expression)
Example of a test paper:
(1) Necessary nodes: [choice], [keywords], note the use of line breaks for separation.
(2) Use the five English half-width characters , . ! ? ; to split clauses.
(3) The serial number of each option should be consecutive, and the serial number
and the content should be written in the form of "serial number + dot + space +
content".
(4) Each option must occupy a single line. If the content of an option is manually
broken onto a new line (automatic wrapping by the system excepted), so that the
second line has no serial number, an error is reported.
(5) The three characters ( ) [ must not appear at the beginning or in the middle of
any choice option text; otherwise an error is reported.
(6) A single ( or ) may appear at the end of each choice option text, but not more
than one.
(7) If you add full-width characters to the content of a choice option, make sure
that they do not take up more than 10% of the bytes of the content of each
choice node.
(8) If you enter unsupported characters in a choice option, make sure that they do
not take up more than 10% of the content bytes of each choice node. Common
unsupported characters are: @ , # , $ , % , ^, & , * , + , = , { , }.
(9) The number of words, not counting symbols, may not exceed 100 for each choice
option.
(10) Unless there is a special need, do not add any characters that are unrelated
to the content of a choice option, including serial numbers, numbers, and
arbitrary characters; doing so affects the labeling and scoring.
[choice]
1. What should I do with the topic?
2. How can I deal with the topic?
3. What can I do with the topic?
4. What should I do with this subject?
5. How can I deal with this subject?
6. What can I do with this subject?
7. What should I do with this title?
8. How can I deal with this title?
9. What can I do with this title?
10. What should I manage this title?
11. How can I manage this title?
12. What can I manage this title?
13. What should I manage this subject?
14. How can I manage this subject?
15. What should I manage this topic?
16. How can I manage this topic?
17. What can I manage this topic?
18. How should I deal with this topic?
19. How should I deal with this title?
20. How should I deal with this subject?
[keywords]
what do topic | how deal topic | what do subject | how deal subject | what do title | how deal title | what manage title | how manage title | what manage subject | how manage subject | what manage topic | how manage topic
[script]
W: Congratulations, Tom! You gave a wonderful speech yesterday morning.
M: Thank you Mary.
W: I will give a speech next Wednesday in my English class, but I am not fully prepared yet. Can you give me some advice?
M: Sure. What's your topic?
W: Well, I am always concerned about environmental issues, so my topic is Environmental Protection.
M: This is a good topic, but it is too big.
[question]
How do I approach this topic?
[macanswer]
You have to narrow down your topic. For example, you may talk about what college students can do to protect our environment. After that, you need to do some research to collect relevant information as much as possible. Then, you should organize your arguments well. Logical organization is very important.
English multiple choice (read_choice)
Example of a test paper:
(1) Necessary nodes: [choice], [keywords], note the use of line breaks for
separation.
(2) Use the five English half-width characters , . ! ? ; to split clauses.
(3) The serial number of each option should be consecutive, and the serial number
and the content should be written in the form of "serial number + dot + space +
content".
(4) Each option must occupy a single line. If the content of an option is broken
onto a new line, so that the second line has no serial number, an error is
reported.
(5) Each choice option can contain full-width characters, provided they take up no
more than 10% of the content bytes of the entire choice node.
(6) The unsupported characters of each choice option cannot take up more than
10% of the content bytes of the entire choice node.
(7) The keywords content must be one of the choice options, and it must match the
content of the correct option completely and contiguously, with nothing missing
(this differs from the choice node restrictions of the situational response
question type).
(8) A single option answer may use the five English half-width characters , . ! ? ;
to split clauses; multiple answers can be separated by a vertical line |.
(9) The number of words, not counting symbols, may not exceed 100 for each choice
option.
[choice]
1. Snakes.
2. Children.
3. Cats.
[keywords]
cats
[question]
What did the woman dislike?
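The numbering and keyword rules for the [choice] and [keywords] nodes can be verified before upload. The sketch below uses hypothetical function names; comparing keywords to options case-insensitively and with surrounding punctuation stripped is an assumption inferred from the example above, where the option "Cats." is matched by the keyword "cats".

```python
import re

# "serial number + dot + space + content", one option per line.
OPTION = re.compile(r"(\d+)\. (\S.*)")


def parse_choices(lines):
    """Check consecutive numbering from 1 and return the option contents."""
    contents = []
    for expected, line in enumerate(lines, start=1):
        match = OPTION.fullmatch(line)
        if match is None:
            raise ValueError(f"not in 'number. content' form: {line!r}")
        if int(match.group(1)) != expected:
            raise ValueError(f"expected serial number {expected}: {line!r}")
        contents.append(match.group(2))
    return contents


def keywords_match_choices(choices, keywords):
    """True when every '|'-separated keyword equals one complete option."""
    normalized = {c.strip(" ,.!?;").lower() for c in choices}
    answers = [k.strip(" ,.!?;").lower() for k in keywords.split("|")]
    return all(answer in normalized for answer in answers)
```

For the paper above, `parse_choices` returns the three option texts and `keywords_match_choices(choices, "cats")` is true. This whole-option comparison applies to read_choice only; the situational response type uses different keyword rules.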
English free-form questions (topic)
Example of a test paper:
(1) Necessary node: [topic], note the use of line breaks for separation.
(2) The first line is the topic title and must be written as "serial number + dot +
space + content", such as 1. + title. Numbering must start from 1 and be
consecutive. The separator must be a space, not a tab or any other character.
The title must not contain the three characters ( ) [ or any full-width characters;
otherwise the labeling reports an error.
(3) The second line is the topic content and must also be written as "serial number
+ dot + space + content", such as 1.1. + content. Numbering must start from 1.1.
The separator must be a space, not a tab or any other character.
(4) If there is more than one topic content entry, the serial numbers must be
consecutive: 1.1. , 1.2. , 1.3. and so on.
(5) Use the five English half-width characters , . ! ? ; to split clauses.
(6) Each entry must occupy a single line. If the content of an entry is manually
broken onto a new line (automatic wrapping by the system excepted), so that the
second line has no serial number, an error is reported.
(7) Non-essential nodes: [number_replace], [vocabulary]. See "Description of
non-essential nodes of sentence questions" for the specification.
[topic]
1. The Goose Thief
1.1. Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised. Students were all fond of them. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students, and they all ran to the pond. The head teacher asked for an explanation. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!" The head teacher smiled and promised not to have them killed for the Christmas dinner.
[keypoint]
1. Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised.
2. Students were all fond of them.
3. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner.
4. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school.
5. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat.
6. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students,
7. They all ran to the pond.
8. The head teacher asked for an explanation.
9. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!"
10. The head teacher smiled and promised not to have them killed for the Christmas dinner.
English retelling questions (retell)
Example of a test paper:
(1) Necessary nodes: [topic], [keypoint], note use newline to separate.
(2) The first line is the topic title and must be written as "serial number + dot +
space + content", such as 1. + title. Numbering must start from 1 and be
consecutive. The separator must be a space, not a tab or any other character.
The title must not contain the three characters ( ) [ or any full-width characters;
otherwise the labeling reports an error.
(3) The second line is the topic content and must also be written as "serial number
+ dot + space + content", such as 1.1. + content. Numbering must start from 1.1.
The separator must be a space, not a tab or any other character.
(4) If there is more than one topic content entry, the serial numbers must be
consecutive: 1.1. , 1.2. , 1.3. and so on.
(5) Use the five English half-width characters , . ! ? ; to split clauses.
(6) Each entry must occupy a single line. If the content of an entry is manually
broken onto a new line (automatic wrapping by the system excepted), so that the
second line has no serial number, an error is reported.
(7) Non-essential nodes: [number_replace], [vocabulary]. See "Description of
non-essential nodes of sentence questions" for the specification.
[topic]
1. The Goose Thief
1.1. Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised. Students were all fond of them. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students, and they all ran to the pond. The head teacher asked for an explanation. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!" The head teacher smiled and promised not to have them killed for the Christmas dinner.
[keypoint]
1. Tom went to primary school in the countryside. Near his classroom, there was a small pond where two geese were raised.
2. Students were all fond of them.
3. One day, when Tom passed the school kitchen, he heard the cooks talking about killing the geese for the teachers' Christmas dinner.
4. Tom got angry, and said to himself, "I won't let them be eaten!" That night, Tom worked out a plan. He was going to hide them somewhere far away from the school.
5. The next morning, Tom went to school in his father's big coat. During the break, he rushed to the pond. Without anyone around, he caught the geese and pushed them inside the coat.
6. However, the geese were larger than he had thought, and they tried very hard to free themselves from the coat. The big noise caught the notice of the head teacher and the students,
7. They all ran to the pond.
8. The head teacher asked for an explanation.
9. Looking at the teacher with fear, Tom told the story and said, "It is unfair to them. We all love them!"
10. The head teacher smiled and promised not to have them killed for the Christmas dinner.
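The numbering scheme for [topic] and [keypoint] is easy to get wrong by hand, so generating it may help. The following sketch is illustrative: `build_topic_paper` is a hypothetical helper, and full-width characters are detected with `unicodedata.east_asian_width` (classes F and W), which is an approximation of what the service considers full-width.

```python
import unicodedata


def build_topic_paper(title, contents, keypoints=()):
    """Build a [topic] paper with an optional [keypoint] node.

    title: topic title (numbered "1. ").
    contents: reference texts (numbered "1.1. ", "1.2. ", ...).
    keypoints: optional key points (numbered "1. ", "2. ", ...).
    """
    if any(ch in "()[" for ch in title):
        raise ValueError("title must not contain ( ) or [")
    if any(unicodedata.east_asian_width(ch) in ("F", "W") for ch in title):
        raise ValueError("title must not contain full-width characters")
    if not contents:
        raise ValueError("at least one topic content entry is required")
    for entry in (title, *contents, *keypoints):
        # Each entry has to stay on one line to keep its serial number.
        if not entry.strip() or "\n" in entry:
            raise ValueError("entries must be non-empty single lines")
    lines = ["[topic]", f"1. {title}"]
    lines += [f"1.{i}. {text}" for i, text in enumerate(contents, start=1)]
    if keypoints:
        lines.append("[keypoint]")
        lines += [f"{i}. {text}" for i, text in enumerate(keypoints, start=1)]
    return "\n".join(lines)
```

Because the serial numbers are generated from list positions, they are consecutive by construction and always followed by a single space.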
English Look and Talk (picture_talk)
Example of a test paper:
(1) Necessary node: [topic], note the use of line breaks for separation. See the [topic] restrictions under the retelling question type (retell) for the specification.
(2) Non-essential nodes: [number_replace], [vocabulary]. See "Description of non-essential nodes of sentence questions" for the specification.
(3) Regarding the non-essential node [keypoint], the serial number of each option should be consecutive, and the serial number and the content should be written in the form of "serial number + dot + space + content".
(4) Regarding the non-essential node [keypoint], if there is more than one option under the keypoint node, select the content of just one of the options and split it into key points.
[topic]
1. Throw Litter
1.1. Mary and her classmates went outing last weekend. Someone was flying kites, some people were having snacks. There were litters on the road. Mary picked up the waste bottles and paper the put them in the dustbin. The teacher praised Mary for her good deed.
1.2. Last weekend, Mary went to the park with her classmates. They had a picnic in the park. Some people flew kites there. They had great fun there. Mary saw some rubbish on the road. She picked up the rubbish and threw it into the dustbin. The teacher praised Mary.
1.3. Last Saturday, Mary's class went to the park. They brought some food and had a picnic on the grass. After that, they flew kites there. Suddenly, Mary found that there was some rubbish on the road. She then picked up the rubbish and threw it into the dustbin. Mary's teacher saw this. She said "Well done" to Mary. Mary was very happy.
1.4. Mary went to the park with her friend last weekend. They had a picnic there, while some people were flying kites. Mary's friend wanted to fly a kite too. So she threw waste bottles and paper on the ground and ran away. Mary saw this and picked up the rubbish. Then she threw it into the garbage can. A woman noticed what Mary had done. She praised Mary for her good behavior.
1.5. Mary went to the park to have a picnic with her friend last Sunday. They brought some juice and bread as lunch. After lunch, they joined other people to fly kites. Mary saw some waste bottles and paper on the ground. Someone threw them away after having a picnic. Mary cleaned the road, putting the garbage into a garbage can. A lady saw this and praised Mary for what she had done.
1.6. Last weekend, Mary and her classmates went to the park. Some of them flew kites, and some of them had food on the grass. Mary brought some juice, bread and biscuits to share with her friend. After they finished eating, her friend went to fly a kite. Mary gathered their waste bottles and paper and was about to threw them into the dustbin. Suddenly, she saw some garbage on the ground. She picked up the garbage, and threw it away with their waste bottles and paper. Her good behavior was noticed by the manager of the park. The manager praised her.
1.7. Last weekend, Mary went outing with her classmates. Mary and her friend were having drinks and some bread. Others were flying kites or playing games. After a while, there were litters on the ground. Mary saw these and started to pick up all the waste paper and bottles. She put them into the dustbin. Mary's teacher praised her for what she had done.
1.8. Mary went for an outing with her classmates last weekend. Some people played games and some people went to fly kites. Mary and Lily were having some snacks. When they were about to play, Mary noticed that there were litters around them. So she picked up the waste bottles and paper and threw them in the dustbin. Just then, her teacher saw it and praised Mary for what she did.
1.9. The school held an outing last weekend. Mary and her classmates had fun there. Some people were playing games while some were flying kites. Mary and one of her classmates were having some snacks. Then, Mary found that there were some waste paper and bottles on the ground. So she threw all of them into the dustbin. At last, the ground became clean and Mary was praised by her teacher.
1.10. Mary and her classmates went for an outing last weekend. They were very happy. Someone was flying kites, some were having food. After having lunch, they went on playing games. Mary noticed that there were some litters on the ground. So she picked up all the litters and then put them in the dustbin. Mary's good deed was saw by her teacher. The teacher praised Mary and felt proud of what she had done.
1.11. Last Saturday, Mary's teacher took her class to an outing. The whole class were very happy then. Some people were flying kites while some were playing games. At lunch time, they had food and drank juice together. After that, there were some waste bottles and paper on the road. Mary started to pick them up and threw them into the dustbin. Her teacher saw it and spoke highly of what Mary had done. Mary felt very proud of herself.
1.12. Last weekend Mary and her classmates went outing and had a picnic. Some people were flying kites, some people were having snacks. Suddenly, they found there was a lot of litter on the road. Mary picked up the waste bottles and paper the put them in the dustbin. The teacher praised Mary for her good behavior.
1.13. Last weekend Mary went to the park with Some friends. Some of them were flying kites. Some friends were eating food. Suddenly, they saw there was some rubbish on the road. Mary picked up the rubbish and put it into the garbage. The teacher said Mary was good.
1.14. Last weekend Mary went to the park. Some classmates were flying kites, some classmate were eating food. Suddenly, they saw there was a lot of rubbish on the road. Mary picked up the rubbish and put it into the dustbin. The teacher said Mary was a good girl.
1.15. Last weekend Mary had a picnic with her cousins in the park. Some were flying kites, some were eating food. They saw there was some litter on the road. Mary picked up the litter and threw it into the dustbin. Her mother said Mary was good.
1.16. Last weekend Mary had a picnic with her cousins in the park. Some flew kites, some ate food. Suddenly, they saw someone dropped a lot of litter on the road. Mary picked up the litter and threw it into the dustbin. Her mother said Mary did a good job.
1.17. Last weekend, Mary went to the park for a picnic with her friend. They brought a lot of food and enjoyed it very much. Lily went to fly kite but she left many rubbish on the ground. Marry cleaned it and put it into the rubbish can. The teacher saw it and she said to Marry, "you are a good girl." What a good girl!
English oral translation (oral_translation)
Example of a test paper:
(1) Required node: [topic]. Separate entries with line breaks. For the specification, see the [topic] restrictions under the required nodes of the story retelling question type.
(2) Optional nodes: [number_replace] and [vocabulary] (for the specification, see the optional node restrictions of the sentence question type), and [keypoint] (see the optional node restrictions of the English Picture Seeing and Speaking question type).
[topic]
1. British People
1.1. British people usually say "hello" or "nice to meet you" and shake your hand when they meet you for the first time. They behave politely in public. They think it's rude to push in before others. They always queue. They are very polite at home as well. When in Rome, do as the Romans do. When we are in a strange place, we should do as the local people do.
1.2. For the first meeting, the English will usually say "hello" or "nice to meet you" and shake hands with you. In the public places, they behave themselves well; they think that jumping in the line is a rude behavior, so they always line up. They are often very polite at home. When we are in a strange place, do in Rome as Rome does. We should behave well as local people.
1.3. When they meet for the first time, the British usually say "hello" or "nice to meet you", and shake hands with each other. In public, they behave themselves appropriately. They think it is impolite to jump the queue, and they always wait in line patiently for their turns. They are also very polite at home. As the saying goes, "when in Rome, do as the Romans do". When we are in a strange place, we should act as the locals do.
1.4. When first meet, English are likely to say "hello" or "nice to meet you" and shake hands with you. They behave well in public. They usually line up because they think queue jumping is very impolite. And they are also very polite at home. There is an old saying "Do in Rome as Rome does". So when we are in a new place, we should behave ourselves as the locals do.
1.5. When meeting for the first time, Englishmen usually say "hello" or "nice to meet you" with a handshake. They behave themselves well in public places. They regard jumping a queue as one of the rude behavior, so they always queue up. They are also very polite at home. When in Rome, do as the Romans do. When we are in a strange place, we should behavior just like the local people.
1.6. For the first meeting, English people usually say "Hello" or "Nice to meet you" and shake hands with you. In the public place, they also act very decently. In their views, it is very impolite to cut in line. They have formed a habit to wait in a queue. At home, they are also very polite. When in a strange place, we should do in Rome as the Romans do. Moreover, it is also polite that we behave like the local people.
1.7. In first meeting, the English often say "hi" or "nice to meet you!" and then shake hands with you. In public occasions, they behave mannerly. They think jumping a queue is impolite and they always line up. Also, they are polite at home. When in Rome do as the Romans do. When we are in a strange land, we should behave like the natives.
[vocabulary]
behavior /b ih 'hh ey v y ax/
uncourteous /,ah n 'k er t ir s/
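The example paper above can be assembled programmatically. The following is a minimal sketch that only mirrors the layout visible in the example (a [topic] node with a numbered title and numbered reference answers, followed by an optional [vocabulary] node). The function name `build_paper` and its arguments are hypothetical, and any format rule not visible in the example is not covered.

```python
def build_paper(title, answers, vocabulary=None):
    """Assemble paper text in the layout shown in the example above.

    title:      topic title, emitted as "1. <title>"
    answers:    reference answers, emitted as "1.1. ...", "1.2. ...", ...
    vocabulary: optional mapping of word -> phoneme string (hypothetical argument)
    """
    lines = ["[topic]", f"1. {title}"]
    for i, answer in enumerate(answers, start=1):
        lines.append(f"1.{i}. {answer}")
    if vocabulary:
        lines.append("[vocabulary]")
        for word, phones in vocabulary.items():
            lines.append(f"{word} /{phones}/")
    # Nodes and entries are separated by line breaks, as the description requires.
    return "\n".join(lines)


paper = build_paper(
    "British People",
    ["They always queue.", "They always line up."],
    {"behavior": "b ih 'hh ey v y ax"},
)
print(paper)
```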
Question type: read_syllable
Description of read_syllable hierarchy fields:
Properties | Annotations |
---|---|
phone_score | acoustic_score |
fluency_score | fluency_score |
tone_score | tone_score |
total_score | Total Score |
beg_pos/end_pos | start/end position (unit: frame, each frame equals to 10ms) |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
Sentence Hierarchy Field Description:
Properties | Annotations |
---|---|
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
beg_pos/end_pos | start/end position (unit: frame, each frame equals to 10ms) |
content | Question Paper Content |
word hierarchy field descriptions:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (unit: frame, each frame is equivalent to10ms) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
syll hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (frame) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128(replace) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non-paper content) |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
phone hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (unit: frame, each frame is equivalent to 10ms) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128(replace) |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non-paper content) |
perr_msg | Error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error). When dp_message is not 0, perr_msg may take the same value as dp_message |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
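Positions and lengths in these tables are counted in frames of 10 ms, and dp_message is a numeric code. A minimal sketch of converting both into readable values follows; the helper names are hypothetical, and the labels are copied from the tables above.

```python
FRAME_MS = 10  # each frame is 10 ms, as stated in the field tables

# dp_message values as listed in the syll and phone tables
DP_MESSAGE = {0: "correct", 16: "miss", 32: "add", 64: "readback", 128: "replace"}


def frames_to_seconds(frames):
    """Convert a beg_pos, end_pos or time_len value (in frames) to seconds."""
    return int(frames) * FRAME_MS / 1000.0


def describe_dp_message(value):
    """Return the label for a dp_message value, or mark it as unknown."""
    code = int(value)
    return DP_MESSAGE.get(code, f"unknown ({code})")


print(frames_to_seconds(150))     # 150 frames of 10 ms each
print(describe_dp_message("16"))  # attribute values usually arrive as strings
```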
Question type: read_word
Description of read_word hierarchical field:
Properties | Annotations |
---|---|
phone_score | acoustic_score |
fluency_score | fluency_score (will return 0 for now) |
tone_score | tone_score |
total_score | Total Score |
beg_pos / end_pos | start/end position (unit: frame, each frame is equivalent to 10ms) |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
Sentence Hierarchy Field Description:
Properties | Annotations |
---|---|
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
beg_pos / end_pos | start/end position (unit: frame, each frame is equivalent to 10ms) |
content | Question Paper Content |
word hierarchy field descriptions:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (unit: frame, each frame is equivalent to 10ms) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
syll hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (unit: frame, each frame is equivalent to 10ms) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128(replace) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non-paper content) |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
phone hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (unit: frame, each frame is equivalent to 10ms) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128(replace) |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non-paper content) |
perr_msg | Error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error). When dp_message is not 0, perr_msg may take the same value as dp_message |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
Question type: read_sentence
Description of read_sentence hierarchy field:
Properties | Annotations |
---|---|
phone_score | acoustic_score |
fluency_score | fluency_score |
tone_score | tone_score |
total_score | Total Score |
beg_pos/end_pos | start/end position (unit: frame, each frame equals to 10ms) |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
Sentence Hierarchy Field Description:
Properties | Annotations |
---|---|
phone_score | acoustic_score |
fluency_score | fluency_score |
tone_score | tone_score |
total_score | Total Score |
beg_pos/end_pos | start/end position (unit, frame, each frame equals to 10ms) |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
word hierarchy field descriptions:
Properties | Annotations |
---|---|
beg_pos/end_pos | start/end positions (frames) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
syll hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (frame) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128(replace) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non paper content) |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
phone hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (frame) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128 (replace) |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non-paper content) |
perr_msg | Error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error). When dp_message is not 0, perr_msg may take the same value as dp_message |
time_len | time_length (unit, frames, each frame equals 10ms) |
Question type: read_chapter
Description of read_chapter hierarchy fields:
Properties | Annotations |
---|---|
phone_score | acoustic_score |
fluency_score | fluency_score |
tone_score | tone_score |
total_score | Total Score |
beg_pos / end_pos | start/end position (frame) |
content | Question Paper Content |
time_len | time_length (unit, frames, each frame equals 10ms) |
Sentence Hierarchy Field Description:
Properties | Annotations |
---|---|
phone_score | acoustic_score |
fluency_score | fluency_score |
tone_score | tone_score |
total_score | Total Score |
beg_pos / end_pos | start/end position (frame) |
content | Question Paper Content |
time_len | time_length (unit, frames, each frame equals 10ms) |
word hierarchy field descriptions:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (frame) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
syll hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (frame) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128(replace) |
symbol | Pinyin: the number represents the tone, 5 represents the soft tone |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non-paper content) |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
phone hierarchy field description:
Properties | Annotations |
---|---|
beg_pos / end_pos | start/end position (frame) |
dp_message | add miss message, 0 (correct) 16 (miss) 32 (add) 64 (readback) 128 (replace) |
content | Question Paper Content |
rec_node_type | paper(paper content),sil(non-paper content) |
perr_msg | Error message: 1 (initial/final error), 2 (tone error), 3 (initial/final and tone error). When dp_message is not 0, perr_msg may take the same value as dp_message |
time_len | time length (unit: frame, each frame is equivalent to 10ms) |
Learning engine xml output table one
Question type: read_word
Read_word layer description:
Properties | Annotations |
---|---|
beg_pos | Multiple word start boundary time |
content | Multi-word content |
end_pos | Multiple Word End Boundary Time |
accuracy_score | accuracy_score |
standard_score | standardized_score |
except_info | Exception Information |
is_rejected | whether or not it was rejected |
total_score | average of total scores for multiple words |
Sentence (sentence) level description:
Properties | Annotations |
---|---|
beg_pos | sentence start boundary time |
content | sentence content |
end_pos | end-of-sentence boundary time |
index | sentence index |
Description of the word layer
Properties | Annotations |
---|---|
beg_pos | word start boundary time |
content | word content |
end_pos | Word End Boundary Time |
dp_message | word added/missed reading message |
global_index | index of the word in the chapter |
index | index of the word in the sentence |
property | word properties |
total_score | word total score |
pitch | word fundamental frequency (pitch) information (reserved field, can be ignored) |
pitch_beg | start value of the word's fundamental frequency |
pitch_end | end value of the word's fundamental frequency |
werr_msg | error result for a misread word (not output when the word is read correctly) |
Syll(syllable) layer description:
Properties | Annotations |
---|---|
beg_pos | Beginning of syllable boundary time |
content | syllabic content |
end_pos | syllable end boundary time |
serr_msg | syllable error message |
syll_accent | syllable stress marker |
Phoneme layer description:
Properties | Annotations |
---|---|
beg_pos | Phoneme start boundary time |
content | phoneme content |
end_pos | phoneme end boundary time |
dp_message | phoneme added/missed reading message |
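The layer tables above describe a nested result (sentence, word, syll, phone). A minimal sketch of walking such a result with Python's standard `xml.etree.ElementTree` follows. The sample XML, its element nesting, and the assumption that the listed fields arrive as XML attributes are illustrative guesses and not output captured from the service.

```python
import xml.etree.ElementTree as ET

# Hypothetical result fragment; real output may contain more layers and attributes.
SAMPLE = """
<read_word beg_pos="0" end_pos="120" content="apple" total_score="85.5">
  <sentence beg_pos="0" end_pos="120" content="apple" index="0">
    <word beg_pos="10" end_pos="110" content="apple" dp_message="0" total_score="85.5">
      <syll beg_pos="10" end_pos="60" content="ae" serr_msg="0" syll_accent="1">
        <phone beg_pos="10" end_pos="60" content="ae" dp_message="0"/>
      </syll>
    </word>
  </sentence>
</read_word>
"""


def collect_words(xml_text):
    """Return one dict per <word> element with its score, dp_message and phones."""
    root = ET.fromstring(xml_text.strip())
    words = []
    for word in root.iter("word"):
        words.append({
            "content": word.get("content"),
            "total_score": float(word.get("total_score", "nan")),
            "dp_message": int(word.get("dp_message", "0")),
            "phones": [p.get("content") for p in word.iter("phone")],
        })
    return words


words = collect_words(SAMPLE)
for w in words:
    print(w["content"], w["total_score"], w["dp_message"], w["phones"])
```

Using `iter()` rather than fixed paths keeps the sketch tolerant of wrapper elements that the sample does not show.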
Question type: read_sentence
read_chapter (chapter) layer description:
Properties | Annotations |
---|---|
accuracy_score | accuracy_score |
beg_pos | chapter start time |
content | Chapter Content |
end_pos | chapter end time |
except_info | Exception Information |
fluency_score | fluency_score |
integrity_score | integrity_score |
standard_score | standard_score |
is_rejected | whether or not it was rejected |
total_score | Total Score |
word_count | number of words in the chapter |
Sentence level description:
Properties | Annotations |
---|---|
beg_pos | sentence start boundary time |
content | sentence content |
end_pos | end-of-sentence boundary time |
accuracy_score | accuracy_score |
fluency_score | fluency_score |
standard_score | standard_score |
index | sentence index |
score(replace with total_score) | total_score, struct (hidden) |
word_count | sentence all word count |
Word layer description:
Properties | Annotations |
---|---|
beg_pos | word start boundary time |
content | word content |
end_pos | Word End Boundary Time |
dp_message | word added/missed reading message |
global_index | index of the word in the chapter |
index | index of the word in the sentence |
property | word properties |
total_score | word total score |
pitch | word fundamental frequency (pitch) information (reserved field, can be ignored) |
pitch_beg | start value of the word's fundamental frequency |
pitch_end | end value of the word's fundamental frequency |
werr_msg | error result for a misread word (not output when the word is read correctly) |
Syll(syllable) layer description:
Properties | Annotations |
---|---|
beg_pos | Beginning of syllable boundary time |
content | syllabic content |
end_pos | syllable end boundary time |
serr_msg | syllable error message |
syll_accent | syllable stress marker |
Phoneme layer description:
Properties | Annotations |
---|---|
beg_pos | Phoneme start boundary time |
content | phoneme content |
end_pos | phoneme end boundary time |
dp_message | phoneme added/missed reading message |
Question type: read_chapter
read_chapter (chapter) layer description:
Properties | Annotations |
---|---|
accuracy_score | accuracy_score |
beg_pos | chapter start time |
content | Chapter Content |
end_pos | chapter end time |
except_info | Exception Information |
fluency_score | fluency_score |
integrity_score | integrity_score |
standard_score | standard_score |
is_rejected | whether or not it was rejected |
total_score | Total Score |
word_count | number of words in a chapter |
Sentence (sentence) level description:
Properties | Annotations |
---|---|
beg_pos | sentence start boundary time |
content | sentence content |
end_pos | end-of-sentence boundary time |
accuracy_score | accuracy_score |
fluency_score | fluency_score |
standard_score | standard_score |
index | sentence index |
score(replace with total_score) | total_score, struct (hidden) |
word_count | sentence all word count |
Word layer description:
Properties | Annotations |
---|---|
beg_pos | word start boundary time |
content | word content |
end_pos | Word End Boundary Time |
dp_message | word added/missed reading message |
global_index | index of the word in the chapter |
index | index of the word in the sentence |
property | word properties |
total_score | word total score |
werr_msg | error result for a misread word (not output when the word is read correctly) |
Syll(syllable) layer description:
Properties | Annotations |
---|---|
beg_pos | syllable start boundary time |
content | syllable content |
end_pos | syllable end boundary time |
serr_msg | syllable error message |
syll_accent | syllable stress marker |
Phoneme layer description:
Properties | Annotations |
---|---|
beg_pos | Phoneme start boundary time |
content | phoneme content |
end_pos | phoneme end boundary time |
Type of question: topic (free-response questions in English)
Description of rec_paper layer:
Properties | Annotations |
---|---|
accuracy_score | semantic accuracy score |
beg_pos | start time of reading aloud |
content | read aloud recognized content |
end_pos | end time of reading aloud |
except_info | Exception Information |
phone_score | Pronunciation accuracy score |
speeking_speed | speed of speech (typically 140-200 words per minute) |
total_score | Total Score |
Sentence Layer Description:
Properties | Annotations |
---|---|
content | sentence content |
index | sentence index |
Word Layer Description:
Properties | Annotations |
---|---|
beg_pos | Word Start Boundary Time |
content | word content |
end_pos | Word End Boundary Time |
Question type: simple_expression (English situational response)
rec_paper layer description:
Properties | Annotations |
---|---|
beg_pos | start time of reading aloud |
content | read aloud recognized content |
end_pos | end time of reading aloud |
except_info | Exception Information |
phone_score | Pronunciation accuracy score |
total_score | Total Score |
Sentence Layer Description:
Properties | Annotations |
---|---|
content | sentence content |
index | sentence index |
word layer description:
Properties | Annotations |
---|---|
beg_pos | Word Start Boundary Time |
content | word content |
end_pos | Word End Boundary Time |
Question type: read_choice (English multiple choice)
free_choice layer description:
Properties | Annotations |
---|---|
beg_pos | start time of reading aloud |
content | read aloud recognized content |
end_pos | end time of reading aloud |
except_info | Exception Information |
total_score | Total Score |
Learning engine xml output table II
Notes and additional information
Precautions | Description |
---|---|
is_rejected return field (some assessment question types do not return this field) | true: rejected; the engine detected gibberish reading and the score should not be used as a reference<br>false: normal |
Standard score in word, sentence, and chapter question types | A standard score is available only if the text contains at least 5 words. |
Gibberish detection for word, sentence, and chapter question types | Gibberish detection is available only if the text contains at least 5 words. (There is currently no gibberish detection for free-form question types.) |
except_info attribute value | except_info=28673 (hexadecimal 0x7001): the engine judges the audio to be silent or too low in volume<br>except_info=28676 (hexadecimal 0x7004): the engine judges the audio to be gibberish<br>except_info=28680 (hexadecimal 0x7008): the engine judges the audio to have a low signal-to-noise ratio<br>except_info=28690 (hexadecimal 0x7012): the engine judges the audio to be truncated<br>except_info=28689 (hexadecimal 0x7011): the engine judges that there is no audio input; check whether the audio or the recording equipment is working properly |
dp_message attribute value | dp_message=0: the engine judges that the word or phoneme was read normally<br>dp_message=16: the engine judges that the word or phoneme was missed (omitted)<br>dp_message=32: the engine judges that the word or phoneme was added (inserted) |
property, werr_msg attributes (effect optimization, can be ignored) | The werr_msg attribute appears only when the engine judges that the word was read incorrectly. For example, property=16 means the word should be linked to its neighbor (liaison); if werr_msg=512 appears in the xml, the engine judges that the speaker did not link at this word; otherwise the engine judges that the word was read correctly.<br>Liaison: property=16; werr_msg=512<br>Stress: property=32; werr_msg=2048<br>End-of-sentence intonation: property=64; werr_msg=4096<br>Sense-group pause: property=2; werr_msg=256<br>Half sentence: property=12. When a word in the text is followed by a single comma, property=12. This is the engine's clause marker; it appears on the word before a clause-separating comma within a sentence, marks a half sentence, and has no special meaning. |
serr_msg attribute | serr_msg=0: the engine judges that the syllable was read correctly<br>serr_msg=1: the engine judges that the syllable was read incorrectly<br>serr_msg=2048: the syllable should be stressed but the engine judges that it was not stressed (syll_accent is 1 in this case; effect optimization, can be ignored)<br>serr_msg=2049: the syllable should be stressed but the engine judges that it was not stressed and was also read incorrectly (syll_accent is 1 in this case; effect optimization, can be ignored) |
syll_accent attribute | syll_accent=0: this syllable does not need to be stressed<br>syll_accent=1: this syllable needs to be stressed |
Papers with pinyin annotations, e.g., When <zai4>Da<da2>Rui was eight years old, one day he wanted to <xiang3>go to the movies. | Pinyin annotations may be added to no more than one third of the characters in the entire paper. |
The syll level and phone level of some question types, such as words, phrases, sentences, and chapters | The content attribute may contain sil or silv (silence) or fil (noise) |
gwpp, pitch, reject_type, no_plo_word, dur_value, magnitude_value, pitch_value, score_pattern | These are reserved fields returned by the model and can be ignored |
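The except_info and is_rejected notes can be turned into a small lookup. The sketch below uses only the values listed in the notes table; the helper names are hypothetical, and any value not listed is reported as unlisted rather than guessed.

```python
# except_info values from the notes table, keyed by their hexadecimal form
EXCEPT_INFO = {
    0x7001: "no voice or low volume",
    0x7004: "gibberish",
    0x7008: "low signal-to-noise ratio",
    0x7011: "no audio input",
    0x7012: "truncated audio",
}


def explain_except_info(value):
    """Render an except_info value as decimal, hexadecimal and meaning."""
    code = int(value)
    meaning = EXCEPT_INFO.get(code, "not listed in the notes table")
    return f"{code} (0x{code:04X}): {meaning}"


def score_is_usable(is_rejected):
    """A score is only meaningful when is_rejected is not 'true'."""
    return str(is_rejected).strip().lower() != "true"


print(explain_except_info("28676"))
print(score_is_usable("false"))
```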
Error code
Error code | Error code description |
---|---|
10163 | Parameter validation failed on the client side; change the request parameters according to the description in the returned message field |
10313 | No app_id was passed in the first frame of the request, or the app_id passed does not match the api_key. |
40007 | Audio decoding failed, please check if the transmitted audio corresponds to the encoding format described in the encoding field. |
11201 | Interface usage has exceeded the maximum limit for purchase. |
10114 | Request timed out, session time exceeded 300s, please control the session time to keep it no longer than 300s |
10043 | Audio decoding failed, please make sure the transmitted audio encoding format is consistent with the request parameters. |
10161 | base64 decoding failed, check if the sent data is encoded in base64 |
10200 | Read data timeout, check if no data has been sent for a total of 10s and the connection has not been closed |
10160 | Illegal request data format, check if request data is legal json |
11200 | Function unauthorized |
60114 | Assessment audio is too long |
10139 | Parameter error |
48196 | The instance does not allow repeated calls to this interface |
40006 | Invalid parameter |
40010 | no response |
40016 | Initialization failure |
40017 | Not initialized |
40023 | Invalid configuration |
40034 | Parameter not set |
40037 | No assessment text |
40038 | No assessment audio |
40040 | Illegal data |
42306 | Insufficient authorization |
68676 | Gibberish |
30002 | No cmd parameter in ssb |
48195 | The instance has no assessment paper set or the test question format is wrong. Check whether the assessment text matches the question type (English question types in particular require special markers in the test questions) and whether parameters such as ent and category are set |
30011 | sid is empty, for example because aus was not set when uploading audio |
68675 | Abnormal audio data; check that the audio is 16k, 16bit, mono and that the aue parameter value matches the audio type |
48205 | The instance was not assessed, e.g. no recording was obtained because the uploaded audio was empty |
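A client can map a returned error code to a hint drawn from the table above. The sketch below is illustrative only: it assumes the server reply is a JSON object with `code` and `message` fields and that `code` 0 means success, neither of which is confirmed by this section, and the hint texts are shortened from the table.

```python
import json

# A few codes from the error table with shortened hints
ERROR_HINTS = {
    10163: "parameter validation failed; adjust the request per the message field",
    10114: "session exceeded 300 s; keep the session shorter",
    10160: "request data is not legal JSON",
    10161: "base64 decoding failed; check that the data is base64 encoded",
    10200: "read timeout; no data was sent for 10 s",
    68675: "abnormal audio; check 16k/16bit/mono and the aue value",
}


def check_response(raw):
    """Return None on success, otherwise a one-line error description.

    Assumes a JSON reply with 'code' and 'message' fields (an assumption).
    """
    reply = json.loads(raw)
    code = reply.get("code", 0)
    if code == 0:
        return None
    hint = ERROR_HINTS.get(code, "see the error code table")
    return f"error {code}: {hint} ({reply.get('message', '')})"


print(check_response('{"code": 10160, "message": "bad json"}'))
```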
Demos
Note: demo is just a simple call example, not suitable for direct use in complex and changing production environments
Pronunciation Assessment Streaming API demo java language
Pronunciation Assessment Streaming API demo python3 language
Pronunciation Assessment Streaming API demo go language
Pronunciation Assessment Streaming API demo nodejs language
Pronunciation Assessment Streaming API demo C# language
Frequently Asked Questions
What are the scoring criteria for Pronunciation Assessments?
Dimension | Weight for adults | Weight for elementary school students | Grade A (9-10 points) |
---|---|---|---|
Accuracy | 50% | 60% | Word pronunciation accurate and clear |
Fluency | 30% | 30% | Reads aloud fluently, speaks at a normal pace, and exhibits essentially no pauses, repetitions, self-corrections, etc. |
Standardness (including emotion) | 20% | 10% | Pronunciation habits in line with native English standards (no Chinese accent), flexible use of pronunciation techniques such as linking, stress, weak forms, and plosion, good rhythm, and full of emotion |
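To make the percentages concrete, the sketch below combines three dimension scores using the weights in the table. It assumes the overall score is a plain weighted sum, which the table suggests but does not state; the function name and the sample scores are hypothetical.

```python
# Weights from the scoring table above
WEIGHTS = {
    "adult": {"accuracy": 0.5, "fluency": 0.3, "standardness": 0.2},
    "elementary": {"accuracy": 0.6, "fluency": 0.3, "standardness": 0.1},
}


def weighted_total(scores, group):
    """Weighted sum of the three dimension scores, rounded to 2 decimals."""
    weights = WEIGHTS[group]
    total = sum(scores[name] * weight for name, weight in weights.items())
    return round(total, 2)


sample = {"accuracy": 9, "fluency": 8, "standardness": 7}  # made-up scores
print(weighted_total(sample, "adult"))
print(weighted_total(sample, "elementary"))
```

With the same dimension scores, the elementary weighting rewards accuracy more and standardness less.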
How many concurrent channels does the Pronunciation Assessment Web api support?
A: 50 concurrent connections are supported by default.
How long does Pronunciation Assessment support voice input at most?
A: For all assessment question types, voice input shorter than 3 minutes is recommended. If the audio sending session lasts more than 5 minutes, error 10114 or 60114 is reported.
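For raw PCM at 16k, 16 bit, mono, the audio length follows directly from the byte count, so the limits above can be checked before sending. This is a minimal sketch with hypothetical helper names; it estimates audio duration only and does not account for time spent on the network during the session.

```python
SAMPLE_RATE = 16000     # samples per second
BYTES_PER_SAMPLE = 2    # 16 bit
CHANNELS = 1            # mono
RECOMMENDED_MAX_S = 180 # 3 minutes recommended
HARD_MAX_S = 300        # 5 minutes, beyond which an error is reported


def pcm_duration_seconds(num_bytes):
    """Duration of raw 16k/16bit/mono PCM data in seconds."""
    return num_bytes / (SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)


def check_duration(num_bytes):
    """Classify the audio length against the recommended and hard limits."""
    duration = pcm_duration_seconds(num_bytes)
    if duration > HARD_MAX_S:
        return "too long"
    if duration > RECOMMENDED_MAX_S:
        return "longer than recommended"
    return "ok"


print(pcm_duration_seconds(32000))  # 32000 bytes is one second of audio
```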
What are the audio requirements for Pronunciation Assessment support?
A: The audio sampling rate is 16k, sampling precision 16 bit, mono audio. For sample audio, please refer to the audio provided in the java demo.
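For WAV files, these properties can be checked locally with Python's standard `wave` module before uploading. The sketch below is an illustration with hypothetical helper names; `make_silence_wav` exists only to produce an in-memory file for the demonstration.

```python
import io
import wave


def wav_problems(source):
    """Return a list of mismatches against 16 kHz, 16 bit, mono.

    source may be a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != 16000:
            problems.append(f"sampling rate is {wav.getframerate()} Hz, expected 16000 Hz")
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit, expected 16 bit")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected mono")
    return problems


def make_silence_wav(rate=16000, width=2, channels=1):
    """Build a tiny in-memory WAV of silence (demonstration helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (160 * width * channels))
    buf.seek(0)
    return buf


print(wav_problems(make_silence_wav()))           # conforming file
print(wav_problems(make_silence_wav(rate=8000)))  # wrong sampling rate
```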
What is the difference between the new streaming version of the assessment and the previous regular version (which has been taken offline)?
A: The main differences are:
1. The new streaming version uses a new architecture and is superior to the regular version overall in product features, assessment results, service stability, and other aspects;
2. The new streaming version supports more question types. In addition to the words, sentences, chapters, and other question types supported by the regular version, it also supports question types such as English situational response, free talk, picture talk, and oral composition (note that these question types must be combined with the customized test paper service; see the corresponding package description on the product details page);
3. Because of the new architecture, the streaming version currently returns results in xml format only; json format will be supported in the near future;
4. The streaming version uses the websocket protocol, while the regular version was based on the http protocol. The access methods differ; refer to the detailed development documentation and sample code for integration.
How do users of the old MSC SDK for Pronunciation Assessment (regular version) switch to the Pronunciation Assessment (streaming version) interface capability?
A: The parameters need to be modified as follows:
1. Set the mandatory parameter sub=ise;
2. For Chinese, the parameter ent=cn_vip must be passed; for English, the parameter ent=en_vip must be passed;
3. Adding the two mandatory parameters above is enough to use the Pronunciation Assessment (streaming version) interface capability.
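The two mandatory parameters can be selected by language as in the following minimal sketch. The function name and the language keys are hypothetical; how the resulting values are handed to the SDK depends on the SDK and is not shown here.

```python
# ent values from the answer above
ENT_BY_LANGUAGE = {"chinese": "cn_vip", "english": "en_vip"}


def streaming_params(language):
    """Return the two mandatory parameters for the streaming version."""
    try:
        ent = ENT_BY_LANGUAGE[language.lower()]
    except KeyError:
        raise ValueError(f"unsupported language: {language!r}") from None
    return {"sub": "ise", "ent": ent}


print(streaming_params("English"))
```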
How can the problem of gibberish speech or reading receiving high scores be solved?
A: The assessment result includes the is_rejected field. When its value is true, the result was rejected because the user spoke gibberish, so the developer can use this field to judge whether the user was speaking gibberish. If the engine reports gibberish, the score should be considered untrustworthy. The cause can be initially determined from the except_info attribute value.