Archives

Dr. Peter Viechnicki joins the HLTCOE as its new Director

September 15, 2020

Dr. Peter Viechnicki, a highly respected technologist with extensive experience as a research manager, has been appointed to serve as Director of the Human Language Technology Center of Excellence (HLTCOE).  Viechnicki, who joined the University on 15 September, received his PhD in Linguistics from the University of Chicago with a dissertation in phonetics. He brings […]


Building OCR/NER Test Collections

March 16, 2020

Named entity recognition (NER) identifies spans of text that contain names. Over the past two decades, many researchers have reported NER results on text created through optical character recognition (OCR). Unfortunately, the test collections that support this research are annotated with named entities after OCR has been run. This means that the collection […]
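
Scoring NER on OCR output requires care: if the gold entity spans are defined on the clean source text, they must be projected onto the noisy OCR text before predictions can be evaluated. The sketch below is a hypothetical illustration of that projection step using Python's difflib character alignment; it is not the collection-building procedure described in the article.

```python
import difflib

def project_span(clean: str, ocr: str, start: int, end: int):
    """Map a [start, end) character span from clean text onto OCR output
    using a character-level alignment. Purely illustrative."""
    matcher = difflib.SequenceMatcher(None, clean, ocr, autojunk=False)
    new_start, new_end = None, None
    for a, b, size in matcher.get_matching_blocks():
        # Each block satisfies clean[a:a+size] == ocr[b:b+size].
        if new_start is None and a <= start < a + size:
            new_start = b + (start - a)
        if a < end <= a + size:
            new_end = b + (end - a)
    return new_start, new_end

clean = "Johns Hopkins University is in Baltimore."
ocr   = "John5 Hopk1ns University is in Baltim0re."
print(project_span(clean, ocr, 0, 24))  # gold span covering "Johns Hopkins University"
```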


HLTCOE a Top Performer at VoxSRC

January 29, 2020

JHU HLTCOE was a top performer in a recent open speaker recognition challenge called VoxSRC, finishing in the top two of more than fifty entries from the international research community. The challenge, hosted by the University of Oxford, was based on the open-source VoxCeleb speech corpus captured from public celebrity videos with automatic speaker labeling […]
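
Systems in evaluations like VoxSRC typically score a verification trial by comparing fixed-dimensional speaker embeddings. The snippet below is a generic illustration of that scoring step with cosine similarity; the embeddings, dimensionality, and threshold are placeholders, and this is not the HLTCOE submission.

```python
import numpy as np

def cosine_score(enroll_emb: np.ndarray, test_emb: np.ndarray) -> float:
    """Score a verification trial: higher means more likely the same speaker."""
    return float(np.dot(enroll_emb, test_emb) /
                 (np.linalg.norm(enroll_emb) * np.linalg.norm(test_emb)))

# Hypothetical 256-dimensional embeddings from some speaker encoder.
rng = np.random.default_rng(0)
enroll, test = rng.normal(size=256), rng.normal(size=256)
same_speaker = cosine_score(enroll, test) > 0.5  # threshold would be tuned on dev data
```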


Processing videos by combining visual and audio cues

December 12, 2019

Videos carry information in both the visual and audio domains, so video processing techniques should exploit both modalities for more effective solutions. The HLTCOE has been researching this strategy since 2017, when recognizing individuals in videos using both voice and face was a topic at the SCALE summer workshop. The work resulting […]
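
One common way to combine the two modalities is late fusion: score an identity hypothesis separately with face and voice embeddings, then combine the scores. The sketch below illustrates a hypothetical weighted score fusion; the weights and embeddings are invented for illustration, and the actual SCALE systems are described in the full article.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def fused_identity_score(face_emb, voice_emb, ref_face, ref_voice, w_face=0.6):
    """Late fusion of face and voice similarity against a reference identity.
    The 0.6 / 0.4 weighting is an arbitrary placeholder, not a reported setting."""
    return (w_face * cosine(face_emb, ref_face)
            + (1.0 - w_face) * cosine(voice_emb, ref_voice))
```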


A Dataset to Support Research on Bilingual Lexicons for Machine Translation (MT)

November 25, 2019

Bilingual lexicons (or bilingual dictionaries) are valuable resources for machine translation. For example, when working with technical documents like patents, a bilingual lexicon consisting of technical jargon is important for ensuring that the translation is precise and correct. At the conference on Empirical Methods in Natural Language Processing (EMNLP) in November 2019, JHU researchers released […]
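
As a toy illustration of why such lexicons matter, the snippet below checks whether an MT output renders each jargon term the way a bilingual lexicon requires. The lexicon entries and sentences are invented examples, not part of the released dataset.

```python
# Hypothetical lexicon of technical jargon: source term -> required target term.
lexicon = {"半导体": "semiconductor", "专利": "patent"}

def missing_terms(source: str, translation: str, lexicon: dict) -> list:
    """Return lexicon terms that appear in the source sentence but whose
    required target rendering is absent from the translation."""
    return [src for src, tgt in lexicon.items()
            if src in source and tgt.lower() not in translation.lower()]

print(missing_terms("该专利涉及半导体器件。",
                    "The patent relates to a semiconductor device."))  # -> []
```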


Analyzing Neural Models by Freezing Subnetworks

October 15, 2018

by Kevin Duh, Senior Research Scientist

The SCALE2018 Machine Translation workshop focused on building resilient neural machine translation systems for new domains. In addition to developing new algorithms to improve translation accuracy, the team dedicated significant effort to analysis techniques in order to understand when and why neural networks work. Neural network models, […]
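
The analysis idea is to freeze part of a trained network, so that further training can only change the remaining parameters, and then observe how performance shifts. The PyTorch sketch below shows a generic way to freeze a named subnetwork; it is an illustrative stand-in, not the SCALE2018 codebase.

```python
import torch.nn as nn

def freeze_subnetwork(model: nn.Module, prefix: str) -> None:
    """Freeze every parameter whose name starts with `prefix`
    (e.g. 'encoder.'), so training updates only the remaining subnetwork."""
    for name, param in model.named_parameters():
        if name.startswith(prefix):
            param.requires_grad = False

# Hypothetical usage with a toy encoder-decoder; only the decoder is trained.
model = nn.ModuleDict({
    "encoder": nn.LSTM(input_size=32, hidden_size=64),
    "decoder": nn.LSTM(input_size=64, hidden_size=64),
})
freeze_subnetwork(model, "encoder.")
trainable = [p for p in model.parameters() if p.requires_grad]
```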


JHU Team Gets High Marks in DIHARD Challenge

October 15, 2018

by Greg Sell, Senior Research Scientist

A team of Johns Hopkins researchers from the HLTCOE and CLSP participated in the recent DIHARD challenge. In the evaluation, teams were given audio recordings of speech from a diverse set of conditions with an unknown number of speakers, with the goal of correctly marking the times that each […]
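
Diarization systems commonly handle the unknown speaker count by embedding short speech segments and clustering them, stopping at a distance threshold rather than a fixed number of clusters. The sketch below illustrates that idea with scikit-learn's agglomerative clustering; the embeddings and threshold are placeholders, not the JHU system.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def diarize(segment_embeddings: np.ndarray, segment_times, threshold: float = 1.0):
    """Assign a speaker label to each speech segment by clustering embeddings.
    The number of speakers is not given, so clustering stops at a distance
    threshold (an illustrative value, not a tuned one)."""
    labels = AgglomerativeClustering(
        n_clusters=None, distance_threshold=threshold
    ).fit_predict(segment_embeddings)
    return [(start, end, f"speaker_{lab}")
            for (start, end), lab in zip(segment_times, labels)]

# Hypothetical embeddings for four segments and their (start, end) times in seconds.
rng = np.random.default_rng(1)
embs = rng.normal(size=(4, 128))
print(diarize(embs, [(0.0, 1.2), (1.4, 2.0), (2.1, 3.3), (3.5, 4.0)]))
```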


MT/IE: Cross-lingual Open Information Extraction with Neural Sequence-to-Sequence Models

February 24, 2017

Cross-lingual information extraction is the task of distilling facts from foreign-language text (e.g., Chinese) into representations in another language preferred by the user (e.g., English tuples). Conventional pipeline solutions decompose the task into machine translation followed by information extraction (or vice versa). We propose a joint solution with a neural sequence model, […]
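
In the joint formulation, the target side is a linearized tuple, so a standard sequence-to-sequence model can be trained to map foreign-language sentences directly to English tuples. The snippet below shows a hypothetical training pair and linearization; the exact format used in the paper may differ.

```python
# A hypothetical training pair for a joint cross-lingual open IE model:
# the source is foreign-language text and the target is a linearized tuple,
# so any standard sequence-to-sequence toolkit can learn the mapping directly.
src = "奥巴马出生于夏威夷。"                  # "Obama was born in Hawaii."
tgt = "( Obama ; was born in ; Hawaii )"      # linearized English tuple

def linearize(subj: str, rel: str, obj: str) -> str:
    """Flatten an (argument, relation, argument) tuple into one target string."""
    return f"( {subj} ; {rel} ; {obj} )"

assert linearize("Obama", "was born in", "Hawaii") == tgt
```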


NVIDIA DGX-1 Comes to the HLTCOE

February 24, 2017

by Kevin Duh, Senior Research Scientist

The HLTCOE has acquired an NVIDIA DGX-1 Deep Learning System to support its numerous deep learning research efforts. With eight Tesla P100 GPUs, this brand-new system delivers 170 teraflops of computing power (equivalent to 250 x86 servers) and has been described as an “AI Supercomputer in a […]


Robust Word Recognition Via Semi-Character Recurrent Neural Network

February 24, 2017

The Cambridge University effect from the psycholinguistics literature demonstrates a robust word processing mechanism in humans, whereby jumbled words (e.g., Cmabrigde / Cambridge) are recognized with little cost. Inspired by the findings from the Cambridge University effect, we propose a word recognition model based on a semi-character level recurrent neural network (scRNN). In our experiments, […]
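
The semi-character representation encodes each word as its first character, a bag of its internal characters, and its last character, which is exactly what makes jumbled spellings look identical to the model. The sketch below builds that input vector (lowercase a–z only, for brevity); the recurrent layer on top is omitted.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
INDEX = {c: i for i, c in enumerate(ALPHABET)}

def semi_character_vector(word: str):
    """Encode a word as three stacked sub-vectors: a one-hot of the first
    character, a bag-of-characters count of the internal characters, and a
    one-hot of the last character (only a-z handled here, for brevity)."""
    word = word.lower()
    first = [0.0] * len(ALPHABET)
    middle = [0.0] * len(ALPHABET)
    last = [0.0] * len(ALPHABET)
    if word:
        first[INDEX[word[0]]] = 1.0
        last[INDEX[word[-1]]] = 1.0
        for ch, n in Counter(word[1:-1]).items():
            middle[INDEX[ch]] = float(n)
    return first + middle + last

# The jumbled and correct spellings share the same semi-character encoding,
# which is why the model is robust to internal character scrambling.
assert semi_character_vector("Cmabrigde") == semi_character_vector("Cambridge")
```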

