My first Open Source Project
- Fernando Llovet
- Jun 28
Hi All,
I hope you're doing well! I wanted to dedicate this blog post to my final-year Computer Science project, which I've made open source today!
This project was completed during my final year at the University of St Andrews, under the supervision of Dr. Olexandr Konovalov, a lecturer in the School of Computer Science.
While the project resulted in a 10,000-word technical report, I wanted to use this post to give a more accessible overview and, more importantly, to share what I think I did right, and what I definitely did wrong. Hopefully, you can learn a thing or two from my experience.
Project Title: Reproducible Data Mining in GitHub
Cool name, right? But what does that actually mean if you’re not from a technical background? Let me break it down:
Data Mining: This is the process of turning raw (often messy) data into something structured and usable. Think of it as cleaning up chaos so we can find patterns, train models, or make predictions.
GitHub: Imagine a mix between Google Drive and a social network, but for code. Developers can upload files, collaborate on projects, and even gain visibility and followers based on their contributions.
Reproducible: It’s one thing to build a project that works for your own use case, but it’s even better to build something that others can use and build on. My project is “reproducible” because it allows other researchers to run similar experiments with little to no code modification.
So, what was the project actually about?
In simple terms: I built a tool that connects to GitHub, retrieves data (like code files), and analyzes it automatically.
Why is that useful? Because it helps researchers and developers understand trends in coding practices and project behaviors over time. This kind of insight can be powerful in improving collaboration, quality, and innovation in software development.
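For the more technical readers, here is roughly what "connect, retrieve, analyze" can look like in a few lines of Python. To be clear, this is not the code from my project, just a minimal sketch under some assumptions: it uses GitHub's public REST API endpoint for listing repository contents, the function names (fetch_contents, count_extensions) are made up for this post, and the sample data stands in for a real API response so the snippet runs without a network connection.

```python
import json
import os
from collections import Counter
from urllib.request import Request, urlopen

API_ROOT = "https://api.github.com"


def fetch_contents(owner, repo, path=""):
    """Ask GitHub's REST API what is stored at `path` in a repository.

    For a directory, GitHub returns a list of entries, each with fields
    such as "name" and "type". Unauthenticated requests are rate-limited,
    so a real tool would normally send an access token as well.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents/{path}"
    request = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(request) as response:
        return json.load(response)


def count_extensions(entries):
    """Count how many files of each extension appear in a directory listing."""
    counts = Counter()
    for entry in entries:
        # Skip sub-directories and anything else that is not a plain file.
        if entry.get("type") != "file":
            continue
        extension = os.path.splitext(entry["name"])[1].lower() or "(none)"
        counts[extension] += 1
    return counts


# Hand-written sample shaped like a directory listing (illustrative data only).
sample_entries = [
    {"name": "README.md", "type": "file"},
    {"name": "main.py", "type": "file"},
    {"name": "utils.py", "type": "file"},
    {"name": "docs", "type": "dir"},
    {"name": "Makefile", "type": "file"},
]

counts = count_extensions(sample_entries)
print(counts.most_common())
```

To try it against a live repository, you would pass the result of fetch_contents (with a real owner and repository name) into count_extensions. Keeping the retrieval step separate from the analysis step is a deliberate choice: the analysis can be re-run on saved data, which is a small taste of what "reproducible" means in practice.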
If you’re curious to dive into the technical details, here’s the full report: https://github.com/theSpanishgolfer/CS4098-Git-Hub-Data-Mining/blob/main/CS4098_GitHub_Data_Mining.pdf
What I Learned:
Work on Something That Interests You: Whether you’re given a project or get to choose your own, find a way to make it interesting for you. Even within strict guidelines, there are often creative ways to steer it toward your passions.
Find Purpose, Not Just Interest: Pure curiosity is great, but for me, having a real-world problem to solve was what kept me motivated. I knew my work had a practical purpose, and that made the tough days worth it.
Leverage Your Supervisor's Expertise: Your supervisor can be your best resource if you make the effort to engage. Early on, I waited for mine to check in, which was a bad move. Supervisors are not babysitters. You need to reach out, follow up, and make use of their knowledge proactively.
Own Your Project: While your supervisor is there to guide you, don’t be afraid to challenge their ideas if you believe in yours. It’s your project. Trust your instincts and advocate for your vision—respectfully, of course.
Start Early, Please: I underestimated how much time this would take. The second semester was chaos. I had to pause hobbies (like golf) and stop blogging just to catch up. Spread your workload throughout the year if you want to avoid burnout.
Leave At Least Two Weeks for the Report: I wrote mine in two intense weeks and barely made the deadline. Unsurprisingly, most of the criticism I got was about the report. Aim to finish it at least a week early so you have time to revise and polish.
Take Ethics Seriously: I changed one of the project’s goals midway, which required new ethics clearance. I also had to drop a potential expansion idea because it would’ve breached ethical guidelines. Don’t overlook this, because it can delay or derail your progress.
Use AI Wisely (If Allowed): AI tools can be incredibly helpful, especially as “secondary advisors.” Use them to explain concepts, identify inconsistencies, or help prioritize features. A prompt I used often was: “Imagine you're [insert role], trying to achieve [goal]. Which of these features would help the most: [feature list]?” This helped me stay focused on building something useful.
Follow a Project Management Style: Even if you're a solo developer, organize your work. Use a method (Kanban, Agile, whatever suits you) to track progress, manage priorities, and stay accountable. It’ll make a massive difference in how efficiently you work.
I truly loved working on this project. It was incredibly rewarding for both my professional and personal growth. I hope this blog gives you some helpful insight if you’re planning your own project, or just starting your final year.
Now that university is over, there's a lot going on in my life and I’m excited to share that soon! I’ll do my best to keep posting, and hopefully sneak in a bit of golf too 😉.
Much love,
Fer