How to count by group in R

Counting by multiple groups, sometimes called crosstab reports, can be a useful way to look at data ranging from public opinion surveys to medical tests. For example, how did people vote by gender and age group? How many software developers who use both R and Python are men vs. women?

There are a lot of ways to do this kind of counting by groups in R. In this article, I’d like to share some of my favorites.

For the demos in this article, I’ll use a subset of the Stack Overflow Developers survey, which surveys developers on dozens of topics ranging from salaries to technologies used. I’ll whittle it down to columns for languages used, gender, and whether they code as a hobby. I also added my own LanguageGroup column for whether a developer reported using R, Python, both, or neither.
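As a rough illustration of how a column like LanguageGroup could be derived, here is a minimal base R sketch. It is not necessarily the method used for this article's data. The toy rows are made up, the semicolon delimiter is an assumption about how the survey stores multiple languages, and the labels "Both" and "R" are assumed to follow the same pattern as the "Python" and "Neither" values.

```r
# Toy stand-in for the survey data; these rows are invented for illustration.
toy <- data.frame(
  LanguageWorkedWith = c("HTML/CSS;Java;Python", "C;R;SQL", "Python;R", "Rust;Ruby"),
  stringsAsFactors = FALSE
)

# Split each string into a vector of language names.
# Assumption: languages are separated by semicolons.
langs <- strsplit(toy$LanguageWorkedWith, ";", fixed = TRUE)

# Test for exact matches so that "R" does not match "Rust" or "Ruby".
uses_r <- vapply(langs, function(x) "R" %in% x, logical(1))
uses_python <- vapply(langs, function(x) "Python" %in% x, logical(1))

# Assign one of four labels per respondent (label names are assumptions).
toy$LanguageGroup <- ifelse(uses_r & uses_python, "Both",
  ifelse(uses_r, "R",
    ifelse(uses_python, "Python", "Neither")))

print(toy)
```

Splitting first and then matching whole names avoids the false positives a plain substring search for "R" would produce.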

If you’d like to follow along, the last page of this article has instructions on how to download and wrangle the data to get the same data set I’m using.

The data has one row for each survey response, and the four columns are all of type character.

str(mydata)
'data.frame':	83379 obs. of  4 variables:
 $ Gender            : chr  "Man" "Man" "Man" "Man" ...
 $ LanguageWorkedWith: chr  "HTML/CSS;Java;JavaScript;Python" "C++;HTML/CSS;Python" "HTML/CSS" "C;C++;C#;Python;SQL" ...
 $ Hobbyist          : chr  "Yes" "No" "Yes" "No" ...
 $ LanguageGroup     : chr  "Python" "Python" "Neither" "Python" ...

I filtered the raw data to make the crosstabs more manageable, including removing missing values and keeping only the two largest gender groups, Man and Woman.
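The filtering described above might look something like the following base R sketch, which ends with a first two-way count. The raw rows here are hypothetical, and the gender labels "Man" and "Woman" and the Hobbyist values "Yes" and "No" are assumed to match the survey's own labels. The actual wrangling code for this article may differ.

```r
# Hypothetical raw data with missing values and a third gender category.
raw <- data.frame(
  Gender = c("Man", "Woman", NA, "Man", "Non-binary", "Woman"),
  Hobbyist = c("Yes", "No", "Yes", NA, "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# Drop any row that has a missing value in any column.
filtered <- raw[complete.cases(raw), ]

# Keep only the two gender groups used for the crosstabs.
filtered <- filtered[filtered$Gender %in% c("Man", "Woman"), ]

# A first count by two groups using base R's table().
counts <- table(filtered$Gender, filtered$Hobbyist)
print(counts)
```

Removing missing values before counting keeps table() from silently dropping rows, so the totals in the crosstab match the number of rows in the filtered data.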