Short biography
I am pursuing a bachelor's degree in mathematics, computer science, and data science at the University of Wisconsin-Madison. I am familiar with Java, C/C++, JavaScript, Python, and R, and I know a little HTML/CSS. My main interests in CS are arithmetic algorithms, cryptology, and optimization problems.
Editor
My first choice of code editor is VSCode, and my second choice is Vim.
I like VSCode mainly because it has an abundant and mature ecosystem for most languages and tools. It is easy to set up a suitable working environment quickly, with mainstream tools such as Git and Docker integrated. There are also many beautiful themes I love (especially Monokai).
I like Vim because it is really easy to call up and use. This lightweight editor has saved me a lot of time when checking logs and outputs on a Linux VM.
Programming experience
Python: I use Python for most of my machine learning work, such as training NLP models with spaCy, for big data analysis with Hadoop-related software (Cassandra, Spark, Kafka), and for code that assists my math homework, such as checking whether a matrix is totally unimodular and simulating a Feistel cipher in CFB mode. I am also familiar with Django and am currently working on a project that helps patients understand doctor's notes, which uses Django as the backend.
C/C++: I implemented a simple shell in C that supports pipes, running commands in detached mode, and output redirection. I also implemented several other school projects, including a simple automatic garbage collector and a multithreaded merge sort. I sometimes use C++ instead of Python for code that assists my math homework, such as implementing the tableau simplex method to solve LP problems.
Java: I used Java to implement a custom version of iperf to test network connectivity and performance on Mininet. I am also familiar with casting, data encapsulation, multithreading, and network communication in Java.
JavaScript: I describe this in detail in the next section.
R: I learned R mostly from classes such as data modeling. I am familiar with using R to plot various kinds of statistical diagrams, run hypothesis tests, and evaluate regression models.
JavaScript experience
During high school, my partner and I built an AI bot for the game generals.io as a research project. I was in charge of data collection: I wrote a crawler in JavaScript to gather game replay data and handled the data transformation, including slicing and replaying players' matches. Currently, I am working on a project developing a Chrome extension that helps people read doctor's notes. I work on the front end, which lets users directly select and highlight the content they want to learn about, using mainly JavaScript together with Chrome-provided APIs.
The feature of JavaScript I like most is its flexibility. As noted above, it can be used in a wide range of environments and tasks. This is largely thanks to its very mature ecosystem, which makes it a welcoming language.
The thing I dislike most is JavaScript's loose, implicit type system, which caused me a lot of trouble and confusion when I was learning the language. I prefer a stricter, more explicit type system.
Node.js experience
I am not very experienced with Node.js, but I am familiar with its basic concepts and usage.
C/Fortran experience
I am experienced in C. C and C++ were the first two programming languages I learned. I took the Computer Organization and Operating Systems courses, both of which use C heavily. Thanks to these courses, I am familiar with memory layout, multithreading, multiprocessing, and data encapsulation in C.
Interest in stdlib
As a student studying math, I really appreciate the purpose and goals of projects like stdlib. Libraries like this make our lives much easier. For instance, I don't need to write code from scratch to simulate random variables from various distributions. Therefore, I'd like to help develop this library and make it better.
Version control
Yes
Contributions to stdlib
I have not yet contributed to stdlib, but I believe this is a great time to start.
Goals
Basic Expectation: implement the multivariate normal distribution in the same way as the distributions already implemented, including but not limited to the following functions:
PDF: Probability density function.
logPDF: Log of the probability density function.
CDF: Cumulative distribution function.
logCDF: Log of the cumulative distribution function.
MGF: Moment generating function for multivariate normal distribution.
Entropy: Compute the differential entropy of the multivariate normal.
mean/
One should be able to create a multivariate normal random variable by
x = multiNormal([a_1,b_1],[a_2,b_2],...)
or, by giving a 2-D covariance matrix, by
x = multiNormal([[c_11,c_12,...],[c_21,c_22],...])
Also, the multivariate normal distribution should work with the plot function to generate the expected diagrams, as the existing implementations of other distributions do.
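To make the listed functions more concrete, here is a minimal sketch of how logPDF, Entropy, and MGF could be computed for a mean vector mu and covariance matrix sigma. It is only an illustration under stated assumptions: the names cholesky, logDet, mvnLogPDF, mvnEntropy, and mvnMGF are hypothetical and are not stdlib APIs, plain nested arrays stand in for multi-dimensional arrays, and a Cholesky factorization is assumed as the way to handle the covariance matrix.

```javascript
'use strict';

// NOTE: all function names here are hypothetical, not part of stdlib.
// Plain nested arrays are assumed in place of stdlib ndarrays.

// Returns the lower-triangular Cholesky factor L with sigma = L * L^T.
// Throws if sigma is not (numerically) positive definite.
function cholesky( sigma ) {
    const n = sigma.length;
    const L = [];
    for ( let i = 0; i < n; i++ ) {
        L.push( new Array( n ).fill( 0.0 ) );
    }
    for ( let i = 0; i < n; i++ ) {
        for ( let j = 0; j <= i; j++ ) {
            let s = sigma[ i ][ j ];
            for ( let k = 0; k < j; k++ ) {
                s -= L[ i ][ k ] * L[ j ][ k ];
            }
            if ( i === j ) {
                if ( s <= 0.0 ) {
                    throw new RangeError( 'covariance matrix must be positive definite' );
                }
                L[ i ][ i ] = Math.sqrt( s );
            } else {
                L[ i ][ j ] = s / L[ j ][ j ];
            }
        }
    }
    return L;
}

// log(det(sigma)) computed from the Cholesky factor: 2 * sum(log(L_ii)).
function logDet( L ) {
    let s = 0.0;
    for ( let i = 0; i < L.length; i++ ) {
        s += Math.log( L[ i ][ i ] );
    }
    return 2.0 * s;
}

// Log of the density: -0.5 * ( n*log(2*pi) + log|sigma| + (x-mu)^T sigma^-1 (x-mu) ).
function mvnLogPDF( x, mu, sigma ) {
    const n = mu.length;
    const L = cholesky( sigma );
    const z = new Array( n );
    let quad = 0.0;
    // Forward substitution solves L * z = x - mu, so that quad = z^T z.
    for ( let i = 0; i < n; i++ ) {
        let s = x[ i ] - mu[ i ];
        for ( let k = 0; k < i; k++ ) {
            s -= L[ i ][ k ] * z[ k ];
        }
        z[ i ] = s / L[ i ][ i ];
        quad += z[ i ] * z[ i ];
    }
    return -0.5 * ( ( n * Math.log( 2.0 * Math.PI ) ) + logDet( L ) + quad );
}

// Differential entropy: 0.5 * ( n*(1 + log(2*pi)) + log|sigma| ).
function mvnEntropy( sigma ) {
    const n = sigma.length;
    return 0.5 * ( ( n * ( 1.0 + Math.log( 2.0 * Math.PI ) ) ) + logDet( cholesky( sigma ) ) );
}

// Moment generating function: exp( mu^T t + 0.5 * t^T sigma t ).
function mvnMGF( t, mu, sigma ) {
    const n = mu.length;
    let lin = 0.0;
    let quad = 0.0;
    for ( let i = 0; i < n; i++ ) {
        lin += mu[ i ] * t[ i ];
        for ( let j = 0; j < n; j++ ) {
            quad += t[ i ] * sigma[ i ][ j ] * t[ j ];
        }
    }
    return Math.exp( lin + ( 0.5 * quad ) );
}
```

For example, mvnLogPDF([0,0], [0,0], [[1,0],[0,1]]) evaluates to -log(2π), roughly -1.8379. Working with the Cholesky factor avoids forming the inverse or the determinant of the covariance matrix explicitly, and it also gives a natural validity check, since the factorization fails when the matrix is not positive definite.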
Bigger Picture
Beyond the basic expectation, I will consider implementing several other multivariate distributions, such as the multivariate hypergeometric, exponential, and Bernoulli distributions.
Why this project?
I am deeply interested in contributing to this project, driven by my strong desire to apply my mathematical background to a math-related open-source library. With a foundation in both computer science and mathematics, especially in the realm of probability, I find this project to be a perfect match for my skills and interests. My academic and practical experiences have equipped me with a robust understanding of mathematical concepts and their computational implementations, making me keenly aware of the challenges and opportunities in developing mathematically rigorous and efficient algorithms. I am eager to contribute by leveraging my knowledge in probability and mathematical analysis. Joining this project represents a unique opportunity for me to merge my passion for mathematics with my computer science expertise, contributing to a library that is pivotal in advancing open-source, math-centric computing solutions.
Qualifications
I have taken college-level courses in probability theory and stochastic processes. I am also doing research under a statistics professor, mostly on stochastic processes and multivariate probability distributions. In addition, I am familiar with tools such as Wolfram Alpha and the random distribution packages in R, and I write my own code (mostly in Python) to solve problems in linear programming, cryptology, group theory, etc. Overall, I have a matching mathematical background and an understanding of what target users need, which makes me a suitable candidate for this project.
Prior art
SciPy has a decent implementation of the multivariate normal distribution. R also has a package that implements the multivariate normal distribution. Julia also supports the multivariate normal distribution.
Commitment
I will finish all my final exams in mid-May, and I can work about 20-30 hrs/week for 12 weeks. I will be located in the US Central timezone and will respond promptly to all messages and video meeting requests.
Schedule
Assuming a 12 week schedule,
Community Bonding Period:
Task: Do research about similar implementations in other languages, write user stories, and turn them into the feature list.
Goal: Summarize the implementation details, write out pseudocode, and list several potential use cases.
Week 1-2:
Task: Write the first draft, do some basic testing, communicate with mentors, and receive feedback.
Goal: Have something runnable and meaningful enough to demo, even if it is still fragile.
Week 3-4:
Task: Revise the first draft code, look for potential bugs, and do more testing.
Goal: No obvious bugs and a concise calling interface.
Week 5-6 (midterm):
Task: Optimize the code, communicate with the community, and consider whether anything can be extended.
Goal: Deliverable code that is ready to be used and tested by a small group of potential users.
Week 7-8:
Task: Run an alpha test with potential users, collect their feedback (including bug reports, feature requests, and suggestions for improvement and optimization), and revise the code.
Week 9-10:
Task: Run a beta test with a wider range of developers. Meanwhile, further optimize the code for speed and make modifications to meet users' requirements.
Week 11:
Task: Run the final round of testing and actively communicate with the stdlib developer community, asking for their suggestions ahead of publication.
Goal: Code that is ready to be published.
Week 12:
Task: Submit the code and wrap up.
Final Week:
Notes:
The community bonding period is a 3 week period built into GSoC to help you get to know the project community and participate in project discussion. This is an opportunity for you to set up your local development environment, learn how the project's source control works, refine your project plan, read any necessary documentation, and otherwise prepare to execute on your project proposal.
Usually, even week 1 deliverables include some code.
By week 6, you need enough done at this point for your mentor to evaluate your progress and pass you. Usually, you want to be a bit more than halfway done.
By week 11, you may want to "code freeze" and focus on completing any tests and/or documentation.
During the final week, you'll be submitting your project.
I have read and understood the application materials found in this repository.
I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
The issue name begins with [RFC]: and succinctly describes your proposal.
I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.
Thanks for your proposal! To strengthen it, I would suggest highlighting your plans for integrating this new distribution with the existing stdlib codebase, especially since it relies on multi-dimensional arrays, which are not part of the native JavaScript language.
It's good that you reference prior art in other languages such as R and Julia, but it would also be beneficial to discuss in more depth how you will implement the multivariate normal distribution or how it is implemented in these reference implementations. For example, how will the covariance matrix be handled? What numerical methods will be used? This way, aside from your highly relevant academic experience and achievements, you could further demonstrate your experience with numerical computing and assure the reviewers that you have the necessary skills to pull off this project.
@BrianP2002 Following up on Philipp's comments, I'd also like to add:
As part of our application requirements, for any application to be considered, a contributor must land a patch to the main project repository. If this requirement is not fulfilled, we will not consider the respective application.
In your timeline, you mentioned user feedback. How do you plan to acquire such feedback? Who is your target audience? My sense is that you are highly unlikely to get substantial feedback or identify a sufficient body of potential users, especially given JavaScript's current standing as a language for scientific computation. In that case, if you don't have a clear user feedback plan, I suggest expanding the technical activities of your proposal accordingly, potentially to include other multivariate distributions or higher-level functionality which relies on the multivariate normal distribution.
Full name
Lin ha
University status
Yes
University name
University of Wisconsin-Madison
University program
Computer Science, Mathematics, Data Science
Expected graduation
2025 Spring
Timezone
US Central Time (UTC−06:00)
Contact details
email: halinbr2002@gmail.com, github: https://github.com/BrianP2002, phone: +1 6089773640
Platform
Linux
Related issues
No response