    Repositories list

    • Examples

      Public
      Various examples for different articles
      HTML
      9216100 Updated Sep 29, 2024
    • cdata

      Public
      Higher order fluid or coordinatized data transforms in R. Distributed under choice of GPL-2 or GPL-3 license.
      R
      Other
      84400 Updated Sep 26, 2024
    • vtreat

      Public
      vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under choice of GPL-2 or GPL-3 license.
      HTML
      Other
      4528400 Updated Jun 13, 2024
    • pyvtreat

      Public
      vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distributed under a BSD-3-Clause license.
      Python
      Other
      811920 Updated Jun 13, 2024
    • WVPlots

      Public
      Pre-packaged plots in R
      R
      Other
      228400 Updated Apr 22, 2024
    • Add-ins and keyboard shortcuts for building calculation pipelines in R
      R
      Other
      693201 Updated Feb 20, 2024
    • wvpy

      Public
      Tools to convert from Jupyter notebooks to and from Python .py files, and render.
      HTML
      Other
      0800 Updated Feb 16, 2024
    • Codd method-chained SQL generator and Pandas data processing in Python.
      Python
      BSD 3-Clause "New" or "Revised" License
      511420 Updated Oct 19, 2023
    • Simple example of how to use an embedding plus sphering/whitening transform to measure difference in distribution.
      Jupyter Notebook
      BSD 3-Clause "New" or "Revised" License
      0000 Updated Sep 7, 2023
    • Implement the rquery piped query algebra in R using data.table. Distributed under choice of GPL-2 or GPL-3 license.
      R
      Other
      33700 Updated Aug 20, 2023
    • rquery

      Public
      Data Wrangling and Query Generating Operators for R. Distributed under choice of GPL-2 or GPL-3 license.
      HTML
      Other
      1510910 Updated Aug 20, 2023
    • Dynamic Programming implemented in Rcpp. Includes example partition and out of sample fitting applications.
      C++
      Other
      21400 Updated Aug 20, 2023
    • sigr

      Public
      Concise formatting of significances in R (GPL3 license).
      HTML
      Other
      22700 Updated Aug 20, 2023
    • wrapr

      Public
      Wrap R for Sweet R Code
      R
      Other
      1113510 Updated Aug 19, 2023
    • wvu

      Public
      Win Vector LLC Python data science teaching tools (graphs and data manipulation)
      HTML
      Other
      0000 Updated Feb 23, 2023
    • Viewable pages from WinVector LLC; view at: http://winvector.github.io
      HTML
      322300 Updated Dec 29, 2022
    • Working an example of supervised machine learning in Python
      Jupyter Notebook
      Other
      0100 Updated Oct 3, 2022
    • seplyr

      Public
      Improved Standard Evaluation Interfaces for Common Data Manipulation Tasks
      R
      Other
      124900 Updated Aug 21, 2022
    • PDSwR2

      Public
      Code, Data, and Examples for Practical Data Science with R 2nd edition (Nina Zumel and John Mount) https://github.com/WinVector/PDSwR2
      HTML
      Other
      12113000 Updated Feb 6, 2022
    • Example of how to build a simple R package
      R
      0200 Updated Nov 27, 2020
    • replyr

      Public
      Patches for using dplyr with Databases and Big Data
      HTML
      Other
      126700 Updated Oct 18, 2020
    • Logistic

      Public
      Experimental logistic regression code supporting multiple result categories, many levels of categorical modeling variables, good optimization, L2 regularization and more.
      Java
      593500 Updated Oct 14, 2020
    • OutOfCore

      Public
      Example of out of core coding techniques
      Java
      1200 Updated Jun 22, 2020
    • Simple example of Locality Sensitive Hashing
      Java
      61400 Updated Jun 22, 2020
    • Importance Sampling Example
      Java
      2200 Updated Jun 22, 2020
    • LStep

      Public
      Trivial demonstration of a diverging Newton-Raphson step when solving a logistic regression
      Java
      Other
      2200 Updated Jun 22, 2020
    • Java code to build synthetic data sets that match reported summary totals. Helps explore possible range of variation.
      Java
      Other
      1000 Updated Jun 22, 2020
    • Experimental pure Java revised simplex linear program solver (Apache 2.0 license)
      Java
      31500 Updated Jun 22, 2020
    • Code and data for "The Geometry of Classifiers"
      R
      GNU General Public License v3.0
      112600 Updated Jun 22, 2020
    • Example code for articles on sessionizing data.
      GNU General Public License v2.0
      2000 Updated Jun 22, 2020