Welcome to Zora’s home on the web :)

I am currently a 1st/3rd year PhD student at the Language Technologies Institute at Carnegie Mellon University, working with Daniel Fried and Graham Neubig. My primary research interest is to build language models with interpretable and generalizable reasoning skills. In particular, I’m working on:

Augmented Language Models [Survey]
  1. Leverage and improve programs
  2. Knowledge: retrieve texts [FilCo] [ReAtt] [RAGGED] and structured data [WikiTable] [TUTA] [K-BERT]
  3. Other modalities: act on images [TroVE], or map fixed LMs to images [SPAE]
Agent(ic System)s that Automate Human Labor
  1. Use LMs to do human tasks, e.g., data analysis [HiTab] and software programming [ODEX]
  2. Facilitate human verification: increase accuracy and accelerate the process [TroVE]

News

  • Jul 2024: Will give a 3-hour tutorial about Large Language Models for Tabular Data at SIGIR 2024. Stay tuned!
  • May 2024: Organized our CMU Agent Workshop 🤖 with plenty of events – insightful tutorials, talks, and posters! I also gave two (short) tutorials about tool-augmented LMs and codegen testbeds.
  • Mar 2024: Gave a guest lecture about Language Agents in the Advanced NLP course (11-711); check out the recordings!
  • Mar 2024: Gave a talk at the FLAME (Foundation and LAnguage Model) 🔥 seminar about our recent survey and TroVE
  • Feb 2024: Gave a talk about Language Models with Tools at the LLM as Agent Seminar, covering TroVE and works in progress 🤫
  • Feb 2024: Gave a lecture about Evaluation (metrics and benchmarks) for the Neural Code Generation (11-891) course 💻
  • Jan 2024: TAing for the new course 11-891 Neural Code Generation; reach out if you want to discuss more project ideas 🪄
🐾 Older News
  1. Nov 2023: Gave a guest lecture about Evaluation and Benchmarks for Code Generation for the Advanced NLP course (11-711) 👩‍🏫 more details [here]
  2. Aug 2023: Gave a talk about 🛠️ Tool using, learning, and making with LLMs at the Code Generation Reading Group; check out the [video]
  3. Apr 2023: Gave a talk about [ODEX] at the Machine Learning Methods in Software Engineering (video) hosted by JetBrains Research Team 👩‍💻

Get Connected

  • If you want to get connected, discuss potential project ideas, ask about applying to CMU, or talk about any other relevant topic, you can book a 15-minute meeting via Calendly.
  • If you are taking 11-891 (Neural Code Generation) and want to chat about project ideas, I hold office hours every Friday, 4-5 pm EST.
  • I usually mentor a few students each semester. If you are an undergrad/master’s student at CMU and are interested in NLP, please reach out to me by email.
  • If you are from an underrepresented group, or do not have much research experience, you are encouraged to reach out!

NLP Research Experience

Prior to CMU, I took a gap year and became an Assistant Researcher at Microsoft Research (Asia) through the Star Bridge Program. My focus then was understanding and leveraging structured knowledge such as tables, via large-scale pre-training or complex question answering and/or data-to-text generation.

During my undergraduate years, I was also selected as a student researcher by the Tencent Rhino-Bird Talent Cultivation Program. I was lucky to explore both the Shenzhen and Beijing offices in China, and to work on interesting projects about knowledge injection and adaptive inference via self-distillation.

Life As An Undergrad

I received my B.S. in Mathematics from Beijing Normal University, though my explorations were a bit more diverse than this. I studied macroeconomic models for Central Bank Digital Currency (CBDC) at the People’s Bank of China; configured human visual coding pathways using self-organizing maps at the IDG/McGovern Institute for Brain Research; captained the genetic engineering work on an antibiotic-free glucose production method and presented it at the iGEM conference; and examined Chinese language models in terms of world and linguistic knowledge acquisition at the Institute of Chinese Information Processing.

Academic Services

I Love My Name

My name in Chinese is 王芷若, which reads as Zhiruo Wang in Hanyu Pinyin. It is usually hard for non-native speakers to pronounce, so you can also call me Zora (as ZR is similar to Zhi Ruo). I love my name, especially in Chinese characters, since it has a more beautiful meaning there than in the English alphabet. 芷 stands for 白芷 (Angelica dahurica) and 若 stands for 杜若 (Pollia japonica), which are two kinds of Chinese herbal medicine. Also, 芷若 is a beautiful vanilla.

Feel free to reach out if you have any questions :)