Week 7 -- Meeting the Team and Choosing Sklearn

This week we started by meeting as a team to decide which open source project to contribute to. After considering everyone’s interests and strengths, as well as how welcoming each project’s community was, we selected Sklearn, a Python module providing popular machine learning algorithms. I expect contributing to it to be challenging, but I am also sure that we will learn a lot from the experience.

I am collaborating with Yao and Jiawei on this project. We began by discussing our personal interests. This went quite smoothly, as all of us are comfortable writing Python and interested in algorithm-related topics. So we quickly narrowed our search down to Pandas and Sklearn, which are similar to some extent.

To choose between these two, we went through three steps: deciding which kind of contribution we wanted to make, checking whether there were related open issues, and checking whether the leadership accepts such contributions quickly. On the first point, we were hoping for something more significant, so we decided that adding new features would be the ideal contribution. Based on this, we found that Sklearn had more open issues about adding new features, and that its leading team was less likely to reject such proposals. In Pandas, however, many issues proposing new features had been refused, so we assumed that contributing to Pandas could be harder.

To wrap up the meeting, we settled on Thursday night as our regular meeting time. We also set our goals for the upcoming week: setting up the development environment on our own machines and selecting several potential open issues to handle. I have already finished configuring the development environment and am now scanning the issues.

From my perspective, I truly hope that we can add at least one new feature to Sklearn. Maybe we can also make some smaller contributions, like polishing several existing algorithms so that they handle more situations. I am a little worried about whether I personally can understand those machine learning algorithms. It is not a simple field to learn. But on the other hand, because of this, I am also looking forward to learning more about machine learning outside of my ML course.

Written before or on March 12, 2023