By Emily Gong published on Jul 15th, 2023

DataSquad Spotlight: Kristian Allen


Kristian Allen joined UCLA Library as the Software Architect in 2008 and has been with the Digital Library Program and Data Science Center (DSC) for over 15 years. As an integral part of the DSC, Kristian focuses on three main areas of work: teaching and instruction, the Deep Learning Machine, and working with DataSquad.

Teaching and Instruction

As a certified Carpentries instructor, Kristian primarily teaches lessons on Python, SQL, and OpenRefine. The Carpentries is a non-profit organization that develops curricula to teach CS and programming skills to people from non-technical backgrounds; it serves as a great resource for graduate students, staff members, and advanced undergraduates at UCLA and beyond. As a teacher, Kristian said his goal is to “meet the learners where they are and make them feel confident in their abilities.” Kristian not only teaches the topics he specializes in twice a year, but also helps at other workshops and experiments with more specialized topics, such as the recent workshop on Intro to AI for GLAM (Galleries, Libraries, Archives, and Museums).

Beyond what the Carpentries usually suggests instructors do, Kristian pays special attention to learners’ feedback and uses it to make the lessons more engaging. Because workshops run for a fixed amount of time, it’s easy to run short and skip parts of a lesson, so choosing what to prioritize matters. For example, Kristian realized that students found the graphing section of the Python tutorial much more useful than the section on writing tests for code. He therefore restructured the lesson to prioritize graphing, which is more valuable to the participants. Kristian also makes an effort to use data that’s interesting to the learners rather than whatever data happens to be available.

Deep Learning Machine

In addition to teaching, Kristian works alongside Leigh Phan on the Library’s Deep Learning Machine (DLM). Over the past three years, demand for AI and ML techniques to analyze data has grown, according to Kristian. Equipped with a GPU, the DLM aims to provide a quick, low-barrier-to-entry solution for researchers who hope to scale their data analysis without waiting on their local computers for days or paying the prohibitive costs of cloud services.

Kristian has led the effort to build a user management system for the DLM so that researchers can log in easily without complicated terminal commands, which can be intimidating for people with little CS experience. After surveying researchers, Kristian noted that most users work in R and Python. Hence, the DSC is building both RStudio and JupyterLab interfaces, aiming to provide the simplest solution that works for the widest audience.

As more and more researchers and graduate students come to the DSC with medium- to large-sized datasets, the DLM will provide great convenience to the campus community. “Once the login system is complete,” Kristian said, “the Deep Learning Machine will be open to anyone on campus.”

Working with DataSquad

Last but not least, Kristian works closely with DataSquad members on consulting projects. He sets up each project for student workers by providing background information and preliminary research, such as helpful libraries and examples from other sources. Rather than leaving project scopes nebulous, Kristian clearly defines objectives and breaks them down into smaller tasks for consultants, which has been extremely helpful for the students.

For example, on the Orang Asli Study, Kristian frequently meets with the DataSquad consultant, Shail, to discuss coding issues and provide feedback. When Shail worked on the base cases for text extraction, Kristian helped identify edge cases and provided detailed instructions on handling them. Combining these methods, Shail developed a robust way to reduce ambiguity when extracting question text from noisy sources. These sources involve proprietary encodings, such as Microsoft Word, along with ambiguous and dual-meaning symbols, such as parentheses, as well as enumerated statements and missing termination data.

While Kristian’s known as the Software Architect on his profile, he does so much more than build software for the Digital Library Program and Data Science Center: he’s an experienced instructor, a skilled developer, and a dedicated mentor. We appreciate Kristian’s contributions to all the areas we mentioned above and the ones we couldn’t mention in this highlight article!