Olivia (Simin) Fan

  • Ph.D. candidate in Machine Learning at École Polytechnique Fédérale de Lausanne (EPFL), advised by Prof. Martin Jaggi.

  • B.Sc. (honors) in Computer Science at the University of Michigan; previously worked with Prof. Rada Mihalcea, Prof. Lu Wang, and Prof. Jie Liu.

  • B.Sc. (government honor) in Electrical and Computer Engineering at Shanghai Jiao Tong University.

  • I mainly work on skiing, photography, piano, singing, ballet & yoga, and badminton, with machine learning research as a leisure-time hobby ;).

  ================================================================

        ❀Welcome to Olivia's WonderLand 🥕
                                          Wish you a nice day! :)

    Research Explorations🧐

    My research interests lie in effective and efficient training of large foundation models, especially LLMs, from the following perspectives:
    • Data curriculum design for efficient pretraining;
    • Understanding LLM training dynamics and generalization behaviours across various domains;
    • Augmenting LLMs with external tools to facilitate rigorous science.

    Publications

    Irreducible Curriculum for Language Model Pretraining

    Simin Fan, Martin Jaggi.
    [NeurIPS 2023 Workshop ATTRIB]

    DOGE: Domain Reweighting with Generalization Estimation

    Simin Fan, Matteo Pagliardini, Martin Jaggi.
    [NeurIPS 2023 Workshop ALOE]

    ReadingQuizMaker: A Human-NLP Collaborative System that Supports Instructors to Design High-Quality Reading Quiz Questions

    Xinyi Lu, Simin Fan, Jessica Houghton, Lu Wang, Xu Wang.
    [CHI 2023]

    Towards Process-Oriented, Modular, and Versatile Question Generation that Meets Educational Needs

    Xu Wang, Simin Fan, Jessica Houghton, Lu Wang.
    [NAACL 2022]

    Genetic Risk Converges on Regulatory Networks Mediating Early Type-2 Diabetes

    Walker JT, Saunders DC, Rai V, Dai C, Orchard P, Hopkirk AL, Reihsmann CV, Tao Y, Fan S, Shrestha S, Varshney A, Wright JJ, Pettway YD, Ventresca C, Agarwala S, Aramandla R, Poffenberger G, Jenkins R, Hart NJ, Greiner DL, Shultz LD, Bottino R, Liu J, Parker SC, Powers AC, Brissova M.
    [Nature 2023]

    Historical OCR Text Quality Analysis and Post-correction

    Instructor: Prof. Sindhu Kutty (UMICH)
    Sponsors: Dr. John Dillon, Dr. Dan Hepp (ProQuest)
    [Multi-disciplinary Project with ProQuest 2022]

       Get in Touch🤗