Conference Object

Query Translation for Cross-lingual Search in the Academic Search Engine PubPsych

Author(s) / Creator(s)

España-Bonet, Cristina
Stiller, Juliane
Ramthun, Roland
van Genabith, Josef
Petras, Vivien

Abstract / Description

We describe a lexical resource-based process for query translation in PubPsych, a domain-specific, multilingual academic search engine for psychology. PubPsych queries are diverse in language, with a high proportion of informational queries and technical terminology. We present an approach for translating queries into English, German, French, and Spanish. We build a quadrilingual lexicon with aligned terms in the four languages using MeSH, Wikipedia and Apertium as our main resources. Our results show that, using the quadlexicon together with some simple translation rules, we can automatically translate 85% of translatable tokens in PubPsych queries, with a mean adequacy over all the translatable text of 1.4 on a 3-point scale (0, 1, 2).
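
The lexicon-based lookup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three lexicon entries, the function names (`build_index`, `translate_query`, `mean_adequacy`) and the single fallback rule (leave unknown tokens untouched) are all assumptions made for illustration, and the sketch handles single-token terms only.

```python
from typing import Dict, List, Tuple

# Toy quadrilingual lexicon (hypothetical sample, not the paper's resource):
# each entry aligns one concept across English, German, French and Spanish.
LEXICON: List[Dict[str, str]] = [
    {"en": "depression", "de": "Depression", "fr": "dépression", "es": "depresión"},
    {"en": "anxiety", "de": "Angst", "fr": "anxiété", "es": "ansiedad"},
    {"en": "memory", "de": "Gedächtnis", "fr": "mémoire", "es": "memoria"},
]


def build_index(lexicon: List[Dict[str, str]], source: str) -> Dict[str, Dict[str, str]]:
    """Map each lower-cased source-language term to its aligned entry."""
    return {entry[source].lower(): entry for entry in lexicon}


def translate_query(query: str, source: str, target: str,
                    lexicon: List[Dict[str, str]] = LEXICON) -> Tuple[str, float]:
    """Translate token by token; return the translation and the token coverage."""
    index = build_index(lexicon, source)
    tokens = query.split()
    translated: List[str] = []
    hits = 0
    for token in tokens:
        entry = index.get(token.lower())
        if entry is None:
            # Assumed fallback rule: keep tokens the lexicon does not cover.
            translated.append(token)
        else:
            translated.append(entry[target])
            hits += 1
    coverage = hits / len(tokens) if tokens else 0.0
    return " ".join(translated), coverage


def mean_adequacy(scores: List[int]) -> float:
    """Average human adequacy judgements given on the 3-point scale 0, 1, 2."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(score not in (0, 1, 2) for score in scores):
        raise ValueError("scores must be 0, 1 or 2")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    text, coverage = translate_query("Depression anxiety xyz", "en", "es")
    print(text, round(coverage, 2))
    print(mean_adequacy([2, 1, 1, 2, 1]))
```

Coverage here is the share of query tokens found in the lexicon, a simplified stand-in for the "translatable tokens" figure reported above; adequacy scores are assumed to come from human annotators.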

Keyword(s)

machine translation
information retrieval

Persistent Identifier

https://hdl.handle.net/20.500.12034/735
https://doi.org/10.23668/psycharchives.928

Date of first publication

2018

Is part of

12th International Conference on Metadata and Semantics Research, 2018, Limassol, Cyprus

Publisher

PsychArchives

Citation

  • PsychArchives acquisition timestamp
    2018-11-05T10:52:49Z
  • Made available on
    2018-11-05T10:52:49Z
  • Publication status
    publishedVersion
  • Review status
    reviewed
  • Persistent Identifier
    https://hdl.handle.net/20.500.12034/735
  • Persistent Identifier
    https://doi.org/10.23668/psycharchives.928
  • Language of content
    eng
    en_US
  • Is related to
    https://doi.org/10.23668/psycharchives.1062
  • Dewey Decimal Classification number(s)
    150
  • DRO type
    conferenceObject
    en_US
  • Leibniz institute name(s) / abbreviation(s)
    ZPID