backend.py
Public calls (may require password for protected lexicons): | |
modes | see the hierarchy of the modes |
groups | see the available index modes and their components |
modeinfo | see the available search fields of a mode |
lexiconinfo | see the available search fields of a lexicon |
lexiconorder | for retrieving the lexicon order |
query | for normal querying |
querycount | for querying with some statistics |
minientry | for getting minientries |
statistics | for getting statistics, aggregation |
statlist | for getting statistics, table view |
autocomplete | for autocompletion of lemgrams |
saldopath | for showing the path from a saldo sense to PRIM |
getcontext | for showing the (alphabetical) neighbours of an entry |
explain | for normal querying, also returning the translated Elastic query and its explanation |
random | for retrieving a random lexical entry |
suggest | make an update suggestion |
suggestnew | suggest a new entry |
checksuggestion | see the status of a suggestion |
Password protected calls: | |
delete | delete an entry by #ID |
mkupdate | update a lexical entry |
add | add a lexical entry |
readd | add an entry which already has an id (one that has been deleted) |
addbulk | add multiple entries |
add_child | add an entry and link it to its parent |
checksuggestions | view suggestions |
acceptsuggestion | accept a suggestion |
acceptandmodify | accept a suggestion after modifications have been made to it |
rejectsuggestion | reject a suggestion |
checkuser | for checking whether the user is ok |
checkuserhistory | for retrieving the edit history of a user |
checkhistory | for retrieving the edit history of an entry |
checklexiconhistory | for retrieving the edit history of one or more lexicons |
checkdifference | see the difference between two versions |
export | export a lexicon |
q=simple||satt och
q=extended||and|field|operator|value | (positive) |
q=extended||not|field|operator|value | (negative) |
Example:
Search all entries with a word form that is "blomma" or "äpple" and that have no part-of-speech tag:
q=extended||and|pos|missing||and|wf|equals|blomma|äpple
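The extended syntax described above can be assembled programmatically. The following is a minimal sketch, not part of the backend: the helper name `build_extended_query` and its tuple layout are assumptions made for illustration. It joins clauses with `||` and the parts of each clause with `|`.

```python
def build_extended_query(clauses):
    """Join (connective, field, operator, values) clauses into a q= value."""
    parts = []
    for connective, field, operator, values in clauses:
        if connective not in ("and", "not"):
            raise ValueError("connective must be 'and' or 'not'")
        # Operators such as 'missing' take no value; several values are
        # separated by '|' and read as alternatives ("blomma" or "äpple").
        parts.append("|".join([connective, field, operator, *values]))
    return "extended||" + "||".join(parts)


query = build_extended_query([
    ("and", "pos", "missing", []),
    ("and", "wf", "equals", ["blomma", "äpple"]),
])
print(query)
```

The resulting string still has to be URL-encoded before it is placed in a request.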
Available query parameters:
'q': | the query |
'mode': | which mode to search in. default: karp |
'resource': | one or more comma separated lexica to search. default: all |
'size': | number of hits to show on each page. default: 25 |
'start': | the index of the first hit to show. default: 0 |
'sort': | one or more comma separated fields to sort on. default: depends on mode, usually lexiconOrder, score, baseform, lemgram |
'format': | get the result in another format. Which options are available depends on lexicon and mode. Examples: xml, lmf, tsb, tab, csv, app, tryck. |
Asking for another format will not return the json objects. For saol, the result will be
{"hits": { "hits": a xml string ... } }
For other resources, the result will be
{"formatted": a string in the current format ... }
NB. The structure of the result for format-posts might change!
Examples:
http://ws.spraakbanken.gu.se/query?q=simple||stort hus&mode=stablekarp
http://ws.spraakbanken.gu.se/query?q=simple||flicka&mode=saldogroup&resource=saldo,saldom,saldoe
http://ws.spraakbanken.gu.se/query?q=extended||and|lemgram|startswith|dalinm--
http://ws.spraakbanken.gu.se/query?q=extended||and|wf|equals|äta||and|pos|missing&resource=kelly,lwt&mode=stablekarp
http://ws.spraakbanken.gu.se/query?q=extended||and|wf|regexp|.*o.*a&mode=bliss
http://ws.spraakbanken.gu.se/query?q=extended||and|wf|regexp|.*o.*a&start=25&mode=bliss
http://ws.spraakbanken.gu.se/query?q=extended||and|sense|exists&mode=historic_ii
http://ws.spraakbanken.gu.se/query?q=extended||and|wf|equals|sitta||not|wf|equals|satt&mode=external
http://ws.spraakbanken.gu.se/query?q=simple||muminfigurer&mode=term-swefin&format=csv
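Queries containing spaces, pipes or non-ASCII characters need percent-encoding when sent from code. The sketch below builds such URLs with the standard library only. The helper name `build_query_url` is hypothetical, the host is the one used in the examples above, and it is an assumption that the backend accepts standard form encoding (for example `+` for a space).

```python
from urllib.parse import urlencode

BASE_URL = "http://ws.spraakbanken.gu.se"  # host taken from the examples above


def build_query_url(q, call="query", **params):
    """Return a URL for `call`, with `q` and any other parameters encoded."""
    # Drop parameters that were not given so the backend uses its defaults.
    given = {key: value for key, value in params.items() if value is not None}
    return f"{BASE_URL}/{call}?" + urlencode({"q": q, **given})


url = build_query_url("simple||stort hus", mode="stablekarp", size=25, start=0)
print(url)
```

Paging then amounts to calling the helper again with `start` increased by `size`.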
Result:
hits {
  total: number of hits
  hits: a list with information about the hits
  hits[n]._source: all information in hit n
  hits[n]._source._id: elasticsearch's identifier for the entry
  hits[n]._version: the current version of this entry
  hits[n]._source.lexiconName
  hits[n]._source.lexiconOrder
  hits[n]._source.FormRepresentations
  hits[n]._source.Sense
  hits[n]._source.WordForms
  hits[n]._source.ListOfComponents
  hits[n]._source.RelatedForm
  hits[n]._source.compareWith
  hits[n]._source.entryType
  hits[n]._source.saldoLinks
  hits[n]._source.see
  hits[n]._source.symbolCenter
  hits[n]._source.symbolHeight
  hits[n]._source.symbolPath
  hits[n]._source.symbolWidth
  For freetext search (simple||...) only:
  hits[n].highlight: information and paths about the matching part of the entry
}
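A decoded query result with the structure above can be walked like any nested dictionary. This is a small sketch under the assumption that the JSON has already been parsed; the function name `summarize_hits` and the sample data are made up for illustration.

```python
def summarize_hits(result):
    """Return (total, [(lexiconName, version), ...]) from a query result."""
    hits = result["hits"]
    rows = [
        (hit["_source"].get("lexiconName"), hit.get("_version"))
        for hit in hits["hits"]
    ]
    return hits["total"], rows


# Hypothetical, heavily trimmed response used only to exercise the function.
sample = {
    "hits": {
        "total": 2,
        "hits": [
            {"_version": 3, "_source": {"lexiconName": "saldo"}},
            {"_version": 1, "_source": {"lexiconName": "saldom"}},
        ],
    }
}
total, rows = summarize_hits(sample)
print(total, rows)
```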
Example:
Result:
hits {...}: as above
distribution: a list of counts for each lexicon that contains at least one match
distribution[n].key: the order for the n:th lexicon
distribution[n].doc_count: the count for the n:th lexicon
distribution[n].lexiconName.buckets[0].doc_count: the count for the n:th lexicon (again)
distribution[n].lexiconName.buckets[0].key: the name of the n:th lexicon
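The distribution list pairs each lexicon's order with a nested bucket holding its name. A sketch of turning it into a plain name-to-count mapping follows; the function name `counts_per_lexicon` and the sample values are assumptions for illustration.

```python
def counts_per_lexicon(result):
    """Map lexicon name to hit count using the 'distribution' list."""
    counts = {}
    for entry in result.get("distribution", []):
        # The lexicon name sits in the first nested bucket of each entry.
        name = entry["lexiconName"]["buckets"][0]["key"]
        counts[name] = entry["doc_count"]
    return counts


# Hypothetical sample shaped like the description above.
sample = {
    "distribution": [
        {"key": 0, "doc_count": 4,
         "lexiconName": {"buckets": [{"key": "saldo", "doc_count": 4}]}},
        {"key": 7, "doc_count": 1,
         "lexiconName": {"buckets": [{"key": "dalin", "doc_count": 1}]}},
    ]
}
print(counts_per_lexicon(sample))
```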
Available query parameters:
'mode': | which mode to search in. default: karp |
'q': | the query |
'resource': | one or more comma separated lexica to search. default: all |
'show': | one or more comma separated fields to show. default: depends on mode, usually lexiconName, lemgram, baseform |
'size': | number of hits to show on each page. default: 25 |
Example:
http://ws.spraakbanken.gu.se/minientry?q=extended||and|wf|equals|sitta|ligga
http://ws.spraakbanken.gu.se/minientry?q=extended||and|writtenForm|equals|får||and|writtenForm|equals|fick&show=pos
Result: See query.
Available query parameters:
'q': | the query |
'mode': | which mode to search in. default: karp |
'resource': | one or more comma separated lexica to search. default: all |
'size': | number of hits to show in each bucket. default: 100 |
'buckets': | one or more comma separated fields to group the results by. default: lexiconName, pos |
'cardinality': | shows the cardinality number of values for the innermost of the requested buckets, instead of showing the actual values. Not compatible with 'q'. 'size' will be ignored. |
Example:
http://ws.spraakbanken.gu.se/statistics
http://ws.spraakbanken.gu.se/statistics?q=simple||kasusformer&mode=karp
http://ws.spraakbanken.gu.se/statistics?resource=hellqvist&mode=historic_ii
http://ws.spraakbanken.gu.se/statistics?buckets=pos.bucket,sense.bucket&size=200&mode=stablekarp
The result is not sorted.
Result: X is the name of the first bucket (default: lexiconName), Y (defaults to pos) the second and so on. For a search where a query or a resource is specified:
aggregations {
  q_statistics.doc_count: total number of hits
  q_statistics.X: information about the data grouped by X (the first bucket)
  q_statistics.X_missing: information about the data missing X
  q_statistics.X.buckets[n].key: X value
  q_statistics.X.buckets[n].doc_count: number of hits within the X value
  q_statistics.X.buckets[n].Y: information grouped by X and then Y
  q_statistics.X.buckets[n].Y.doc_count: number of hits within the Y value in X
  q_statistics.X_missing.buckets[n]: information about entries which do not have any X value
  q_statistics.X_missing.buckets[n].doc_count: number of hits that do not have any X value
  q_statistics.X.buckets[n].Y_missing: information grouped by X and then Y, showing cases without any value for Y
  ....
}
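Reading the first bucket level out of such a result might look as follows. This is a sketch only: it assumes the top-level key is spelled `q_statistics`, that the JSON is already decoded, and that X is `lexiconName`; the function name and sample numbers are invented for illustration.

```python
def first_level_counts(result, first_bucket="lexiconName"):
    """Return (total hits, {X value: doc_count}) for the first bucket level."""
    # Assumption: the aggregation is stored under the key 'q_statistics'.
    stats = result["aggregations"]["q_statistics"]
    counts = {
        bucket["key"]: bucket["doc_count"]
        for bucket in stats[first_bucket]["buckets"]
    }
    return stats["doc_count"], counts


# Hypothetical sample with X = lexiconName.
sample = {
    "aggregations": {
        "q_statistics": {
            "doc_count": 5,
            "lexiconName": {
                "buckets": [
                    {"key": "saldo", "doc_count": 3},
                    {"key": "saldom", "doc_count": 2},
                ]
            },
        }
    }
}
print(first_level_counts(sample))
```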
Available query parameters:
'q': | the query |
'mode': | which mode to search in. default: karp |
'resource': | one or more comma separated lexica to search. default: all |
'buckets': | one or more comma separated fields to group the results by. default: lexiconName, pos |
'size': | number of hits to show in each bucket. It therefore does not correspond to the number of table rows. default: 100 |
Example:
http://ws.spraakbanken.gu.se/statlist
http://ws.spraakbanken.gu.se/statlist?q=simple||ärt&buckets=resource,lemgram.bucket
http://ws.spraakbanken.gu.se/statlist?resource=hellqvist
http://ws.spraakbanken.gu.se/statlist?mode=historic_i&buckets=pos.bucket
http://ws.spraakbanken.gu.se/statlist?buckets=pos.bucket&size=200&mode=stablekarp
Result:
{ "stat_table": [ [ "konstruktikon", "", 88 ], [ "saldom", "nn", 3 ], ... ] }
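Each row of `stat_table` holds the bucket values followed by a count, so rows can be aggregated with ordinary list handling. The sketch below sums the counts per value of the first bucket; the function name and the third sample row are assumptions added for illustration.

```python
from collections import defaultdict


def totals_by_first_column(result):
    """Sum the trailing count of every row, grouped by the first column."""
    totals = defaultdict(int)
    for *keys, count in result["stat_table"]:
        totals[keys[0]] += count
    return dict(totals)


# The first two rows mirror the example above; the third is hypothetical.
sample = {
    "stat_table": [
        ["konstruktikon", "", 88],
        ["saldom", "nn", 3],
        ["saldom", "vb", 2],
    ]
}
print(totals_by_first_column(sample))
```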
Provides lemgram suggestions to Korp by looking in mode 'external'.
It does not match prefixes: searching for "sig" does not return suggestions such as "sigill" or "signatur".
Examples:
http://ws.spraakbanken.gu.se/autocomplete?mode=external&q=sig
http://ws.spraakbanken.gu.se/autocomplete?multi=kasta,docka&resource=saldom&mode=external
http://ws.spraakbanken.gu.se/autocomplete?q=kasus&resource=saldom,dalin,hellqvist
http://ws.spraakbanken.gu.se/autocomplete?q=kasta&resource=saldom
Available query parameters:
'q': | the query, a word form |
'multi': | a comma separated list of queries (word forms). do not use together with q |
'resource': | one or more comma separated lexica to search. default: all |
'mode': | which mode to search in. default: karp |
The result is not sorted and one lemgram may occur multiple times
Result:
hits {
  total: number of hits
  hits: information about the hits
  hits[n]._source: information about hit n
  hits[n]._source.FormRepresentations.lemgram: lemgram
}
If the 'multi' parameter is used, the output will be a dictionary with one key corresponding to every input word. The values will have the same format as for q:
{"kasta": {"hits": ... }, "docka": {"hits": ... } }
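Because the result is unsorted and may repeat a lemgram, client code will usually deduplicate it. The following sketch does that for both the `q` and the `multi` form. It assumes `FormRepresentations` may arrive either as one object or as a list of objects; the function names and sample lemgrams are hypothetical.

```python
def lemgrams(hits_result):
    """Return the distinct lemgrams of one autocomplete result, sorted."""
    found = set()
    for hit in hits_result["hits"]["hits"]:
        forms = hit["_source"].get("FormRepresentations", [])
        # Assumption: the field may be a single object or a list of objects.
        if isinstance(forms, dict):
            forms = [forms]
        found.update(form["lemgram"] for form in forms if "lemgram" in form)
    return sorted(found)


def lemgrams_multi(multi_result):
    """Apply lemgrams() to every word of a 'multi' result."""
    return {word: lemgrams(res) for word, res in multi_result.items()}


# Hypothetical sample data.
sample = {
    "kasta": {"hits": {"total": 2, "hits": [
        {"_source": {"FormRepresentations": [{"lemgram": "kasta..vb.1"}]}},
        {"_source": {"FormRepresentations": [{"lemgram": "kasta..vb.1"}]}},
    ]}},
    "docka": {"hits": {"total": 1, "hits": [
        {"_source": {"FormRepresentations": {"lemgram": "docka..nn.1"}}},
    ]}},
}
print(lemgrams_multi(sample))
```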
Example:
Result:
{ path: [input_sense..1, ... , PRIM..1] }
The sorting order is based on the mode configs. The order must be strict, e.g. no two words may have the same score; if they do, getcontext will not work properly.
Example:
http://ws.spraakbanken.gu.se/getcontext/#lexicon?center=#ID
http://ws.spraakbanken.gu.se/getcontext/saldo?q=extended||and|pos|equals|nn&size=2
http://ws.spraakbanken.gu.se/getcontext/saol?q=extended||not|ptv|equals|true&size=2
Available query parameters:
'center': | the ES-ID of the entry to center the search around. default: the first entry |
'q': | an optional query to restrict entries that appear in the result |
'size': | number of hits to show on each side of the center word. default: 10 |
{
  center: [ *the centered entry* ],
  pre: [ *a list of hits (that match the query) occurring immediately before the centered entry* ],
  post: [ *a list of hits (that match the query) occurring immediately after the centered entry* ]
}
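The three lists can be stitched back into one sequence when the neighbours are to be displayed in order. This is a minimal sketch that assumes `pre` and `post` are already returned in the mode's sort order; the function name is hypothetical and plain strings stand in for full hit objects.

```python
def ordered_context(result):
    """Return (entries, index of the centered entry) for a getcontext result."""
    # Assumption: 'pre' and 'post' are already in the mode's sort order.
    entries = result["pre"] + result["center"] + result["post"]
    return entries, len(result["pre"])


# Strings stand in for hit objects in this hypothetical sample.
sample = {"pre": ["a", "b"], "center": ["c"], "post": ["d", "e"]}
entries, center_index = ordered_context(sample)
print(entries[center_index])
```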
Example:
Result:
ans: the normal query result
elastic_json_query: the query translated to Elastic's api
explain: Elastic's result to a _validate/query?explain query
Result:
{ "lexiconA": 1, "lexiconB": 3, "lexiconC": 8, ... }
Available query parameters:
'resource': | one or more comma separated lexica to search. default: all |
'mode': | which mode to search in. default: karp |
'show': | one or more comma separated fields to show. default: depends on mode, usually lexiconName, lemgram, baseform |
'show_all': | all fields of the entries will be shown if this flag is set to true (or any value). Overrides show |
'size': | number of hits to show on each page. default: 1 |
Example:
The field 'version' is optional, but will prevent an old version from later overriding a newer one. Example:
http://ws.spraakbanken.gu.se/suggest/#lexicon/#ID -d {"message" : ..., "user": ..., "version": ... }
http://ws.spraakbanken.gu.se/suggestnew/#lexicon -d {"message" : ..., "user": ..., "version": ... }
Result:
{
  "es_ans": {...} // output from ES
  "es_loaded": 1, // 1 if the suggestion is stored in ES
  "id": // the #ID of the suggestion. Can be used to see the current status, accept or reject the suggestion
  "sql_loaded": 1, // 1 if the suggestion is stored in SQL
  "suggestion": true,
  sql_error // present if there were errors storing the suggestion
}
Examples:
http://ws.spraakbanken.gu.se/delete/#lexicon/#ID
Result:
{
  'sql_loaded' : 1 if successfully marked as deleted in the SQL database,
  'es_loaded' : 1 if successfully deleted from ElasticSearch (is no longer searchable),
  'es_ans' : the answer from ES.
}
XPOST http://ws.spraakbanken.gu.se/mkupdate/#lexicon/#ID -d '{'doc' : updated entry, 'version' : (last) version, 'message' : update message}'
{
  'sql_loaded' : 1 if successfully saved in the SQL database,
  'es_loaded' : 1 if successfully stored in ElasticSearch (is searchable),
  'es_ans' : {'_id':..., '_index':..., '_type':..., '_version': ...} // the answer from ES.
}
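An mkupdate call carries the updated entry, the last known version and a message as a JSON body. The sketch below prepares such a request without sending it. The function name, lexicon, ID and entry are hypothetical, the `Content-Type` header is an assumption, and authentication is deliberately left out because the text above does not describe it.

```python
import json
from urllib.request import Request

BASE_URL = "http://ws.spraakbanken.gu.se"  # host taken from the examples above


def build_mkupdate_request(lexicon, entry_id, doc, version, message):
    """Prepare (but do not send) a POST request for mkupdate."""
    payload = {"doc": doc, "version": version, "message": message}
    return Request(
        f"{BASE_URL}/mkupdate/{lexicon}/{entry_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


# Hypothetical lexicon name, ID and entry, for illustration only.
request = build_mkupdate_request(
    "saldo", "abc123", {"lexiconName": "saldo"}, 3, "fix typo"
)
print(request.get_method(), request.full_url)
```

Passing the version that was read together with the entry lets the backend reject the update if someone else has changed the entry in the meantime.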
Error messages:
Version conflict:
{"message": "Database exception: Error during update. Message: TransportError(409, u'RemoteTransportException[...]; nested: VersionConflictEngineException[...]: version conflict, current [3], provided [1]]; ')."}
ID could not be found:
{"message": "Database exception: Error during update. Message: TransportError(404, u'RemoteTransportException[...]; nested: DocumentMissingException[...]: document missing]; ')"}
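Since both errors arrive as free text in the "message" field, a client has to inspect the string to tell them apart. A hedged sketch follows: the function name and return shape are assumptions, and the patterns are based only on the two messages shown above, so other backend versions may word them differently.

```python
import re

# Pattern derived from the version-conflict message shown above.
CONFLICT = re.compile(
    r"version conflict, current \[(\d+)\], provided \[(\d+)\]"
)


def classify_update_error(message):
    """Return (kind, current_version, provided_version) for an error message."""
    match = CONFLICT.search(message)
    if match:
        return ("version_conflict", int(match.group(1)), int(match.group(2)))
    if "DocumentMissingException" in message:
        return ("missing", None, None)
    return ("unknown", None, None)


conflict_message = (
    "Database exception: Error during update. Message: TransportError(409, "
    "u'RemoteTransportException[...]; nested: VersionConflictEngineException"
    "[...]: version conflict, current [3], provided [1]]; ')."
)
print(classify_update_error(conflict_message))
```

On a version conflict the client can fetch the entry again, reapply its change to the current version and retry.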
XPOST http://ws.spraakbanken.gu.se/add/#lexicon -d '{'doc' : {...,'lexiconName': 'saldo','lexiconOrder' : 0,...}, 'version' : (last) version, 'message' : update message}'
XPOST http://ws.spraakbanken.gu.se/readd/#lexicon/#ID -d '{'doc' : {...,'lexiconName': 'saldo','lexiconOrder' : 0,...}, 'version' : (last) version, 'message' : update message}'
Result:
{
  'sql_loaded' : 1 if successfully saved in the SQL database,
  'es_loaded' : 1 if successfully stored in ElasticSearch (is searchable),
  'es_ans' : {'_id':..., '_index':..., '_type':..., '_version': ..., 'created' : True}, // the answer from ES.
  'suggestion': False. True if the update has been treated as suggestion.
}
XPOST http://ws.spraakbanken.gu.se/addbulk/#lexicon -d '{'doc' : [{...entry1...}, {...entry2...}, ...], 'message' : update message}'
{
  'sql_loaded' : number of entries successfully saved in the SQL database,
  'es_loaded' : number of entries successfully stored in ElasticSearch,
  'ids' : a list of the new entries' IDs,
  'suggestion': False
}
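For addbulk the counters report how many entries reached each store, so a caller can compare them with the number submitted. A small sketch, with a hypothetical function name and sample values:

```python
def bulk_fully_loaded(result, submitted):
    """True if every submitted entry was stored in both SQL and ElasticSearch."""
    return (
        result["sql_loaded"] == submitted
        and result["es_loaded"] == submitted
        and len(result["ids"]) == submitted
    )


# Hypothetical answers for a bulk upload of three entries.
complete = {"sql_loaded": 3, "es_loaded": 3, "ids": ["a", "b", "c"], "suggestion": False}
partial = {"sql_loaded": 3, "es_loaded": 2, "ids": ["a", "b"], "suggestion": False}
print(bulk_fully_loaded(complete, 3), bulk_fully_loaded(partial, 3))
```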
XPOST http://ws.spraakbanken.gu.se/addchild/#lexicon/#parentid -d '{'doc' : {...,'lexiconName': 'saldo','lexiconOrder' : 0,...}, 'version' : (last) version, 'message' : update message}'
Result:
{
  'parent': the result of adding the link to the parent (see mkupdate),
  'child' : the result of adding the child (see add).
}
Required query parameters:
'resource': | one or more comma separated lexicons to search. default: all |
Available query parameters:
'size': | the number of suggestions to view (order by decreasing date). default: 50 |
'status': | waiting, rejected, accepted. default: all |
Example:
http://ws.spraakbanken.gu.se/checksuggestions?resource=konstruktikon&size=2&status=waiting
Result:
{
  "updates": [
    {
      "acceptmessage" // is set when the suggestion is accepted or rejected
      "date": // the date of suggestion
      "doc": // the suggested lexical entry
      "id": // the #ID of the suggestion
      "lexicon": // the lexicon it belongs to
      "message": // message from the suggester
      "origid": // the #ID of the entry the suggestion concerns
      "status": // the status of the suggestion (waiting, accepted or rejected)
      "user": // the name or email address of the suggester
      "version": // the version of the entry that the suggestion concerns
    }
  ]
}
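When reviewing suggestions it is often enough to list the ones still waiting together with the entries they concern. The sketch below does this on an already decoded result; the function name and the sample IDs are hypothetical.

```python
def waiting_suggestions(result):
    """Return (suggestion id, original entry id) for suggestions still waiting."""
    return [
        (update["id"], update["origid"])
        for update in result["updates"]
        if update["status"] == "waiting"
    ]


# Hypothetical sample with trimmed-down suggestion objects.
sample = {
    "updates": [
        {"id": "s1", "origid": "e1", "status": "waiting"},
        {"id": "s2", "origid": "e2", "status": "accepted"},
        {"id": "s3", "origid": "e1", "status": "waiting"},
    ]
}
print(waiting_suggestions(sample))
```

The suggestion IDs returned here are the ones to pass to acceptsuggestion or rejectsuggestion, not the original entry IDs.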
Example:
http://ws.spraakbanken.gu.se/checksuggestion/#lexicon/#ID
The #ID refers to the suggestion, not the original entry.
Result:
{
  "updates": [
    {
      "acceptmessage" // is set when the suggestion is accepted or rejected
      "date": // the date of suggestion
      "doc": // the suggested lexical entry
      "id": // the #ID of the suggestion
      "lexicon": // the lexicon it belongs to
      "message": // message from the suggester
      "origid": // the #ID of the entry the suggestion concerns
      "status": // the status of the suggestion (waiting, accepted or rejected)
      "user": // the name or email address of the suggester
      "version": // the version of the entry that the suggestion concerns
    }
  ]
}
Example:
http://ws.spraakbanken.gu.se/acceptsuggestion/#lexicon/#ID -d {"message" : ...}
The #ID refers to the suggestion, not the original entry. The message will be stored in the suggestion database and in the live database.
Result:
{
  "es_ans": {
    "_id": // #ID of the updated entry
    "_index":
    "_type":
    "_version":
  },
  "es_loaded": // 1 if successfully loaded to ES
  "sql_loaded": // 1 if successfully loaded to the live SQL
  "sugg_db_error": // present if there were errors storing the suggestion
  "sugg_db_loaded": // 1 if successfully loaded to suggestion SQL
  "sugg_es_ans": {
    "es_ans": {...}, // ans from ES
    "es_loaded": // 1 if removed from the suggestion ES
    "sql_loaded": // 1 if the suggestion was marked as accepted
  }
}
Example:
http://ws.spraakbanken.gu.se/acceptandmodify/#lexicon/#ID -d {"doc": {...}, "message" : ...}
The #ID refers to the suggestion, not the original entry. The data is the new version that should be kept. The message will be stored in the suggestion database and in the live database.
Result:
{
  "es_ans": {
    "_id": // #ID of the updated entry
    "_index":
    "_type":
    "_version":
  },
  "es_loaded": // 1 if successfully loaded to ES
  "sql_loaded": // 1 if successfully loaded to the live SQL
  "sugg_db_error": // present if there were errors storing the suggestion
  "sugg_db_loaded": // 1 if successfully loaded to suggestion SQL
  "sugg_es_ans": {
    "es_ans": {...}, // ans from ES
    "es_loaded": // 1 if removed from the suggestion ES
    "sql_loaded": // 1 if the suggestion was marked as accepted
  }
}
Example:
http://ws.spraakbanken.gu.se/rejectsuggestion/#lexicon/#ID -d {"message" : ...}
The #ID refers to the suggestion, not the original entry. The message will be stored in the suggestion database.
Result:
{
  "es_ans": {...} // output from the deletion from the suggestion ES
  "es_loaded": // 1 if successfully removed from the suggestion ES
  "sugg_db_error": // present if there were errors storing the suggestion
  "sugg_db_loaded": // 1 if successfully loaded to suggestion SQL
}
Result:
{ "authenticated": is the user name and password ok, "permitted_resources.lexica": lexicons that the user may see or edit }
'size': | number of hits to show on each page. default: 10 |
Result:
{
  "updates": [
    {
      "date",
      "doc", // the entry that has been edited
      "id",
      "message",
      "user"
    },
    ...
  ]
}
'size': | number of hits to show on each page. default: 10 |
Result:
{
  "updates": [
    {
      "date",
      "doc", // the entry that has been edited
      "id",
      "message",
      "user"
    },
    ...
  ]
}
'size': | number of hits to show on each page. default: 10 |
Examples. If a date is provided (in the correct format), only updates made later than this date are shown.
http://ws.spraakbanken.gu.se/checklexiconhistory/#lexicon
http://ws.spraakbanken.gu.se/checklexiconhistory/blissword/20150922
Result:
{
  "updates": [
    {
      "date",
      "doc", // the entry that has been edited
      "id",
      "message",
      "user",
      "type" // CHANGED, ADDED or REMOVED, only present if checklexiconhistory is called
    },
    ...
  ]
}
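The "type" field makes it easy to get an overview of what happened to a lexicon. A short sketch that counts the updates per type follows; the function name and sample data are made up, and updates without a "type" field are skipped.

```python
from collections import Counter


def count_change_types(result):
    """Count the updates of a history result per change type."""
    return Counter(
        update["type"] for update in result["updates"] if "type" in update
    )


# Hypothetical sample with trimmed-down update objects.
sample = {
    "updates": [
        {"id": "e1", "type": "CHANGED"},
        {"id": "e2", "type": "ADDED"},
        {"id": "e3", "type": "CHANGED"},
        {"id": "e4"},
    ]
}
print(count_change_types(sample))
```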
Example.
http://ws.spraakbanken.gu.se/checkdifference/#lexicon/#ID/latest
http://ws.spraakbanken.gu.se/checkdifference/#lexicon/#ID/latest/#fromdate
http://ws.spraakbanken.gu.se/checkdifference/#lexicon/#ID/#fromdate/#todate
Result:
{
  "diff": [
    {
      "field", // a field that has been changed between the two versions
      "after", // the content of the field in the later of the two versions
      "before", // the content of the field in the older of the two versions (not present if the field is added)
      "type" // added, changed or removed
    }
  ]
}
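A diff result can be rendered as one readable line per changed field. The sketch below is an illustration only: the function name and sample values are hypothetical, and a missing "before" or "after" is shown as None.

```python
def describe_diff(result):
    """Return one human-readable line per entry in a checkdifference result."""
    lines = []
    for change in result["diff"]:
        before = change.get("before")  # absent when the field was added
        after = change.get("after")
        lines.append(
            f'{change["type"]}: {change["field"]}: {before!r} -> {after!r}'
        )
    return lines


# Hypothetical sample diff.
sample = {
    "diff": [
        {"field": "pos", "type": "changed", "before": "nn", "after": "vb"},
        {"field": "note", "type": "added", "after": "new"},
    ]
}
for line in describe_diff(sample):
    print(line)
```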
Available parameters:
'date': | export the entries as they were a given date. default: latest |
'export': | export to another format than json, eg csv, tsv, xml. Not available for all lexicons! default: json |
'size': | number of hits to show. default: all entries |
Example.
http://ws.spraakbanken.gu.se/export/#lexicon
http://ws.spraakbanken.gu.se/export/#lexicon?date=20170901
http://ws.spraakbanken.gu.se/export/#lexicon?format=lmf&size=2
Result (if json):
{ "#lexicon" : [ ... the lexicon ... ] }
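Export URLs with a date can be generated from a date object. This sketch assumes the YYYYMMDD format seen in the example above (date=20170901); the function name and the lexicon used are hypothetical, and nothing is sent over the network.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "http://ws.spraakbanken.gu.se"  # host taken from the examples above


def build_export_url(lexicon, on_date=None, size=None):
    """Return an export URL, optionally for a given date and number of entries."""
    params = {}
    if on_date is not None:
        # Assumption: dates are written as YYYYMMDD, as in the example above.
        params["date"] = on_date.strftime("%Y%m%d")
    if size is not None:
        params["size"] = size
    query = urlencode(params)
    return f"{BASE_URL}/export/{lexicon}" + (f"?{query}" if query else "")


print(build_export_url("saldo", date(2017, 9, 1)))
```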