
EILEEN BLUM

PhD


ABOUT ME

I am a linguistics PhD with intimate knowledge of speech sounds and the computation of patterns in natural language. My experience with CCAI has taught me just how much of an impact minor adjustments can have on a machine language model. I excel at paying attention to details while understanding their effects on the larger project. Overall, I love using my linguistic knowledge to expand the accessibility of machine language technologies and to improve user experiences.

I also have a passion for writing and editing technical documentation. I value clarity and concision and am experienced in tailoring information to a variety of audiences. I have written internal task documentation, public best practices, and client-specific recommendations for using DialogflowCX. I love when something I have written can directly help others to accomplish their technical goals.


Outside of work, I enjoy training animals and medieval armored combat. I rode and trained off-the-track thoroughbreds in the hunter/jumper discipline for over ten years. I trained my first dog as a kid and trained my late cat to perform some basic tasks on cue. I also participated in heavy combat in the Society for Creative Anachronism (SCA), and I enjoy LARP and boffer fighting, with an affinity for pole weapons, particularly glaive and axe.


EDUCATION

PHD LINGUISTICS, RUTGERS UNIVERSITY

September 2015 - January 2023

BA LINGUISTICS, UNIVERSITY OF CALIFORNIA SANTA CRUZ

September 2012 - June 2014

INTERSEGMENTAL GENERAL EDUCATION TRANSFER CURRICULUM, DIABLO VALLEY COLLEGE

August 2010 - May 2012


WORK EXPERIENCE

July 2022 - Present

DIALOGUE DESIGNER, Data Piper

Google Cloud CCAI

  • Advise clients on how to improve virtual agents using DialogflowCX

  • Create sample data and write labeling instructions for LLM evaluation
  • Annotate and summarize live agent conversations for comparison with LLM summaries
  • Complete project management certificate, Coursera

July 2021 - July 2022

DIALOGUE DESIGNER, Tek Systems

Google Cloud CCAI

  • Identified data quality issues, created a plan, and resolved NLU and UX problems; implemented changes directly within DialogflowCX

  • Annotated conversation data to identify voice and chat bot failures and successes

  • Developed, tested, and deployed a new label taxonomy; oversaw training of 15 team members

  • Utilized cross-functional collaboration to improve labeling process  

  • Wrote 17 process documents to track changes and train new team members 

  • Contributed to two best practices documents for DialogflowCX, published by Google

  • Wrote documentation guide; proofread and edited README for new SCRAPI Python library

September 2015 - June 2021

LINGUISTICS FELLOW & TA, Rutgers University

  • Executed and documented two major research projects over eight years 

  • Employed cross-discipline collaboration to improve arguments 

  • Presented cutting-edge research to expert audiences at local conferences 

  • Developed strong proficiency with IPA and an excellent understanding of other linguistic representations

  • Created course content and served as primary instructor for two introductory linguistics and two expository writing courses with up to 30 students each 

  • Organized and hosted first PhD to Industry informational event with five panelists and up to 50 attendees 

  • Orchestrated summer mini-course with five lessons on methods of artificial learning 

  • Coordinated colloquium series for two years 


PROJECTS

DISSERTATION

The effects of non-linear data structures on the computation of vowel harmony (Jan 2023)

I apply formal language theory to natural language data in order to analyze the computational complexity of vowel harmony patterns across both well-studied and understudied languages. I use this computational approach to investigate the effects of different representational data structures on complexity, and I develop a new theory of autosegmental locality.
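
To give a flavor of what a locality-based analysis of vowel harmony looks like, here is a minimal sketch. It is a toy, string-based illustration in the spirit of tier-based strictly local grammars, not the autosegmental analysis developed in the dissertation. The Finnish-style vowel inventory and the function names `vowel_tier` and `is_harmonic` are assumptions chosen for the example.

```python
# Toy backness harmony check (illustrative assumption, not the dissertation's method).
FRONT = set("äöy")
BACK = set("aou")
# Neutral vowels (e, i) are left off the tier, so they are transparent to harmony.


def vowel_tier(word):
    """Project only the harmonizing vowels of a word onto a tier."""
    return [ch for ch in word.lower() if ch in FRONT or ch in BACK]


def is_harmonic(word):
    """True if no two adjacent vowels on the tier disagree in backness."""
    tier = vowel_tier(word)
    return all((a in FRONT) == (b in FRONT) for a, b in zip(tier, tier[1:]))
```

For example, `is_harmonic("pöytä")` holds because every vowel on the tier is front, while a mixed form such as `"pöyta"` is rejected. The check only ever inspects pairs of neighbors on the tier, which is what makes the pattern local in this simplified setting.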

METAL LYRICS GENERATOR

Erdős Institute Natural Language Processing Bootcamp (February-March, 2021)

My partner and I built a Wasserstein Generative Adversarial Network (WGAN) in Python to automatically generate song lyrics in lines of 8 words at a time. We compared our WGAN with a Soft-GAN trained on the same dataset. We trained both GANs on a Kaggle dataset of metal song lyrics, which we processed using NLTK and pandas. The GANs were built using Keras, TensorFlow, and NumPy. Lastly, we calculated BLEU scores for both models and determined that neither generated very natural-sounding lyrics: the WGAN received all 0s, while the Soft-GAN averaged 0.06 for n-grams of length 1-4.
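
For readers unfamiliar with BLEU, the sketch below shows roughly how such a score is computed for one generated line against one reference line. It is a simplified, unsmoothed, single-reference version written in plain Python for illustration; it is not the project's evaluation code, and the function names (`ngrams`, `modified_precision`, `bleu`) are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform weights over 1..max_n."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # the geometric mean collapses when any precision is zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity = 1.0 if c > r else math.exp(1 - r / c)
    return brevity * math.exp(log_mean)
```

Because the score is a geometric mean of n-gram precisions, a generated line that shares no n-grams of some length with the reference scores exactly 0 without smoothing, which is one way a model can end up with all-zero scores.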

METAL OR NOT?

Erdős Institute Data Science Bootcamp (May 2020)

My partner and I created a classifier in Python to distinguish song lyrics by genre. We used two Kaggle datasets of song lyrics, which we cleaned using the Gensim and NLTK packages. We then used the shallow neural network in Word2Vec to create high-dimensional word vectors, applied PCA and k-clustering to group them based on semantic similarity, and trained a DecisionTreeClassifier to distinguish lyric sets. The classifier achieved 81% accuracy.

QUALIFYING PAPER 2

On the locality of vowel harmony over autosegmental representations (2018)

I applied formal language theory to natural language data in order to analyze the computational complexity of vowel harmony patterns over autosegmental representations. I analyzed vowel harmony patterns in multiple languages and predicted possible cross-linguistic variation.

QUALIFYING PAPER 1

Allophony-driven stress in Munster Irish (2018)

I designed and implemented a production experiment to determine the word stress pattern of Munster Irish (Gaelic). I organized all the data files by hand in Excel, then annotated, transcribed, and analyzed all of the acoustic data by hand in Praat. Statistical analyses were performed using t-tests in Excel, then verified using linear mixed-effects models in R.
