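Gathers R package metadata and collection signals in one place.
The collection signals that need to be judged at first glance are placed first.
Backend-related packages detected from DESCRIPTION.
Basic metadata is condensed into small cards and tokens.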
| Package | Type | Spec |
|---|---|---|
| badger CRAN · 0.1.1 · 2026-05-30 | Imports | badger |
| data.table CRAN · 0.1.1 · 2026-05-30 | Imports | data.table |
| dplyr CRAN · 0.1.1 · 2026-05-30 | Imports | dplyr |
| framecleaner CRAN · 0.1.1 · 2026-05-30 | Imports | framecleaner |
| gtools CRAN · 0.1.1 · 2026-05-30 | Imports | gtools |
| janitor CRAN · 0.1.1 · 2026-05-30 | Imports | janitor |
| listviewer CRAN · 0.1.1 · 2026-05-30 | Imports | listviewer |
| magrittr CRAN · 0.1.1 · 2026-05-30 | Imports | magrittr |
| purrr CRAN · 0.1.1 · 2026-05-30 | Imports | purrr |
| rlang CRAN · 0.1.1 · 2026-05-30 | Imports | rlang |
| rlist CRAN · 0.1.1 · 2026-05-30 | Imports | rlist |
| scales CRAN · 0.1.1 · 2026-05-30 | Imports | scales |
| stringr CRAN · 0.1.1 · 2026-05-30 | Imports | stringr |
| tibble CRAN · 0.1.1 · 2026-05-30 | Imports | tibble |
| tidyr CRAN · 0.1.1 · 2026-05-30 | Imports | tidyr |
| tidyselect CRAN · 0.1.1 · 2026-05-30 | Imports | tidyselect |
| utils CRAN · 0.1.1 · 2026-05-30 | Imports | utils |
| knitr CRAN · 0.1.1 · 2026-05-30 | Suggests | knitr |
| rmarkdown CRAN · 0.1.1 · 2026-05-30 | Suggests | rmarkdown |
| testit CRAN · 0.1.1 · 2026-05-30 | Suggests | testit |

Reverse dependencies (packages whose DESCRIPTION lists validata):

| Package | Type | Spec |
|---|---|---|
| TidyConsultant 0.1.2 CRAN · 2026-05-30 | Imports | validata |

Reverse dependencies by type:

| Type | Packages |
|---|---|
| Imports | 1 |

NEWS

validata 0.1.1

validata 0.1.0

Added a NEWS.md file to track changes to the package.

README

validata

The goal of validata is to provide functions for validating the structure and properties of data frames.

Installation

You can install the released version of validata from CRAN with `install.packages("validata")`.

You can install the development version from GitHub with `devtools::install_github("Harrison4192/validata")`, after running `install.packages("devtools")` if devtools is not yet installed.

Help for package validata

Contents: validata-package, %>%, confirm_distinct, confirm_mapping, confirm_overlap, confirm_overlap_internal, confirm_strlen, data_mode, determine_distinct, determine_mapping, determine_overlap, diagnose, diagnose_category, diagnose_missing, diagnose_numeric, mode_fn, mode_pct, n_dupes, sample_data1, top_n_vals, view_missing

| Field | Value |
|---|---|
| Title | Validate Data Frames |
| Version | 0.1.1 |
| Maintainer | Harrison Tietze <Harrison4192@gmail.com> |
| Description | Functions for validating the structure and properties of data frames. Answers essential questions about a data set after initial import or modification. What are the unique or missing values? What columns form a primary key? What are the properties of the numeric or categorical columns? What kind of overlap or mapping exists between 2 columns? |
| License | MIT + file LICENSE |
| URL | https://harrison4192.github.io/validata/, https://github.com/Harrison4192/validata |
| BugReports | https://github.com/Harrison4192/validata/issues |
| Encoding | UTF-8 |
| LazyData | true |
| RoxygenNote | 7.3.3 |
| Imports | dplyr, stringr, janitor, rlang, tidyselect, purrr, magrittr, tidyr, tibble, gtools, listviewer, data.table, scales, utils, framecleaner, badger, rlist |
| Suggests | knitr, rmarkdown, testit |
| VignetteBuilder | knitr |
| Depends | R (≥ 2.10) |
| NeedsCompilation | no |
| Packaged | 2026-02-25 14:06:29 UTC; harrisontietze |
| Author | Harrison Tietze [aut, cre] |
| Repository | CRAN |
| Date/Publication | 2026-02-26 10:00:02 UTC |

validata: Validate Data Frames
Author(s): Maintainer: Harrison Tietze Harrison4192@gmail.com
See Also: Useful links: https://harrison4192.github.io/validata/ and https://github.com/Harrison4192/validata. Report bugs at https://github.com/Harrison4192/validata/issues.

Pipe operator
Description: See `magrittr::%>%` for details.
Usage: `lhs %>% rhs`

Confirm Distinct
Description: Confirm whether the rows of a data frame can be uniquely identified by the keys in the selected columns. Also reports whether the data frame has duplicates. If so, it is best to remove duplicates and re-run the function.
Usage: `confirm_distinct(.data, ...)`
Arguments: `.data`, a data frame; `...`, (ID) columns.
Value: a logical value, returned invisibly, with a description printed to the console.
Examples: `iris %>% confirm_distinct(Species, Sepal.Width)`

Confirm structural mapping between 2 columns
Description: The mapping between elements of 2 columns can have 4 different relationships: one-one, one-many, many-one, many-many. This function returns a view of the mappings by row and prints a summary to the console.
Usage: `confirm_mapping(.data, col1, col2, view = T)`
Arguments: `.data`, a data frame; `col1`, column 1; `col2`, column 2; `view`, whether to view the results.
Value: A view of mappings. Also returns the view as a data frame invisibly.
Examples: `iris %>% confirm_mapping(Species, Sepal.Width, view = FALSE)`

Confirm Overlap
Description: Prints a Venn-diagram-style summary of the unique value overlap between two columns and also invisibly returns a data frame that can be assigned to a variable and queried with the overlap helpers. The helpers can return values that appeared only in the first column, only in the second column, or in both columns.
Usage: `confirm_overlap(vec1, vec2, return_tibble = F)`, `co_find_only_in_1(co_output)`, `co_find_only_in_2(co_output)`, `co_find_in_both(co_output)`
Arguments: `vec1`, vector 1; `vec2`, vector 2; `return_tibble`, logical: if TRUE, returns a tibble, otherwise (the default) returns the data frame invisibly to be queried by the helper functions; `co_output`, data frame output from `confirm_overlap`.
Value: tibble; overlap summary or overlap table.
Examples: `confirm_overlap(iris$Sepal.Width, iris$Sepal.Length) -> iris_overlap`, then `iris_overlap`, `iris_overlap %>% co_find_only_in_1()`, `iris_overlap %>% co_find_only_in_2()`, `iris_overlap %>% co_find_in_both()`

Confirm Overlap internal
Description: A Venn-style summary of the overlap in unique values of 2 vectors.
Usage: `confirm_overlap_internal(vec1, vec2)`
Arguments: `vec1`, vector 1; `vec2`, vector 2.
Value: 1 row tibble.
Examples: `confirm_overlap(iris$Sepal.Width, iris$Sepal.Length)`

confirm string length
Description: Returns a count table of string lengths for a character column. The helper function `choose_strlen` filters the data frame for rows containing a specific string length in the specified column.
Usage: `confirm_strlen(mdb, col)`, `choose_strlen(cs_output, len)`
Arguments: `mdb`, data frame; `col`, unquoted column; `cs_output`, data frame, output from `confirm_strlen`; `len`, integer vector.
Value: prints a summary and returns a data frame invisibly; data frame with the original columns, filtered to the specific string length.
Examples: `iris %>% tibble::as_tibble() %>% confirm_strlen(Species) -> iris_cs_output`, then `iris_cs_output`, `iris_cs_output %>% choose_strlen(6)`

data_mode
Description: data_mode
Usage: `data_mode(x, prop = TRUE)`
Arguments: `x`, vector; `prop`, show frequency as ratio? Default T.
Value: named double of length 1.

Automatically determine primary key
Description: Uses `confirm_distinct` in an iterative fashion to determine the primary keys.
Usage: `determine_distinct(df, ..., listviewer = TRUE)`
Arguments: `df`, a data frame; `...`, columns or a tidyselect specification, defaults to everything; `listviewer`, logical, defaults to TRUE to view output using the listviewer package.
Details: The goal of this function is to automatically determine which columns uniquely identify the rows of a data frame. The output is a printed description of the combinations of columns that form unique identifiers at each level. At level 1, the function tests whether individual columns are primary keys. At level 2, the function tests all n-choose-2 combinations of columns to see whether they form primary keys. The final level tests all columns at once. Completely unique columns are recorded in level 1 but then dropped from the data frame to facilitate the determination of multi-column primary keys. If the dataset contains duplicated rows, they are eliminated before proceeding.
Value: list.
Examples: `sample_data1 %>% head`, then `sample_data1 %>% determine_distinct(listviewer = FALSE)`. The example comments read: on level 1, each column is tested as a unique identifier; the VAL columns have no duplicates and hence qualify, even though they normally would be considered as IDs. On level 3, combinations of 3 columns are tested, implying that ID_COL 1,2,3 form a unique key. Level 2 does not appear, implying that combinations of any 2 ID_COLs do not form a unique key.

Determine pairwise structural mappings
Description: Determine pairwise structural mappings.
Usage: `determine_mapping(df, ..., listviewer = TRUE)`
Arguments: `df`, a data frame; `...`, columns or a tidyselect specification; `listviewer`, logical, defaults to TRUE to view output using the listviewer package.
Value: description of mappings.
Examples: `iris %>% determine_mapping(listviewer = FALSE)`

Determine Overlap
Description: Uses `confirm_overlap` in a pairwise fashion to show a Venn-style comparison of unique values between the columns chosen by a tidyselect specification.
Usage: `determine_overlap(db, ...)`
Arguments: `db`, a data frame; `...`, tidyselect specification, defaulting to everything.
Value: tibble.
Examples: `iris %>% determine_overlap()`

diagnose
Description: Pipe in a data frame to return a diagnosis of its missing and unique values for each column. The default behavior is to diagnose all columns, but a subset can be specified in the dots with tidyselect.
Usage: `diagnose(df, ...)`
Arguments: `df`, data frame; `...`, tidyselect.
Details: This function is inspired by the excellent dlookr package. It takes a data frame and returns a summary of uni…
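
The level-by-level primary-key search described for `determine_distinct` can be illustrated with a small base R sketch. This is not validata's implementation: `is_key` and `find_keys` are hypothetical helper names, and the toy data frame is made up for illustration. The sketch only assumes the behavior stated in the Details text (drop duplicated rows, record unique single columns at level 1, then test combinations of the remaining columns).

```r
# Hypothetical helpers, not part of validata.

# TRUE when the selected columns uniquely identify every row of `df`.
is_key <- function(df, cols) {
  anyDuplicated(df[, cols, drop = FALSE]) == 0
}

# Level 1 tests single columns; higher levels test combinations of the
# columns that were not already unique on their own.
find_keys <- function(df) {
  df <- unique(df)  # duplicated rows are eliminated before proceeding
  keys <- list()
  single <- Filter(function(col) is_key(df, col), names(df))
  if (length(single) > 0) keys$level_1 <- single
  rest <- setdiff(names(df), single)
  if (length(rest) >= 2) {
    for (level in 2:length(rest)) {
      combos <- combn(rest, level, simplify = FALSE)
      hits <- Filter(function(cols) is_key(df, cols), combos)
      if (length(hits) > 0) keys[[paste0("level_", level)]] <- hits
    }
  }
  keys
}

# Made-up example data: `value` is unique alone, `id` + `part` is unique jointly.
toy <- data.frame(
  id    = c(1, 1, 2, 2),
  part  = c("a", "b", "a", "b"),
  value = c(10, 20, 30, 40)
)
keys <- find_keys(toy)
print(keys)
```

Here `value` qualifies at level 1 and is then set aside, so the pair `id` and `part` is reported at level 2.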
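
The Venn-style comparison of unique values that `confirm_overlap` and `determine_overlap` report can be approximated with base R set operations. The sketch below is an assumption about the idea, not the package's code: `overlap_summary` is a hypothetical helper, and its three list elements loosely correspond to what the `co_find_only_in_1`, `co_find_only_in_2`, and `co_find_in_both` helpers are described as returning.

```r
# Hypothetical helper, not part of validata.
# Splits the unique values of two vectors into three Venn regions.
overlap_summary <- function(vec1, vec2) {
  u1 <- unique(vec1)
  u2 <- unique(vec2)
  list(
    only_in_1 = setdiff(u1, u2),   # values seen only in the first vector
    only_in_2 = setdiff(u2, u1),   # values seen only in the second vector
    in_both   = intersect(u1, u2)  # values shared by both vectors
  )
}

res <- overlap_summary(c(1, 2, 2, 3), c(2, 3, 4))
print(lengths(res))
```

Duplicates within a vector do not affect the result, because only unique values are compared.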
diagnose
Examples: `diagnose(iris)`

diagnose_category
Description: Counts the distinct entries of categorical variables. The `max_distinct` argument limits the scope to categorical variables with a maximum number of unique entries, to prevent overflow.
Usage: `diagnose_category(.data, ..., max_distinct = 5)`
Examples: `diagnose_category(iris)`

diagnose_missing
Description: Faster than `diagnose` if the emphasis is on diagnosing missing values. Also, only shows the columns with any missing values.
Usage: `diagnose_missing(df, ...)`
Examples: `diagnose_missing(tibble::tibble(x = c(NA, 1)))`

diagnose_numeric
Description: Takes a data frame and returns various summary statistics of the numeric columns. For example, `zeros` returns the ratio of 0 values in that column, `minus` counts negative values, and `infs` counts Inf values. Other, rarer metrics are also returned that may be helpful for quick diagnosis or understanding of numeric data. `mode` returns the most common value in the column (chosen at random in case of a tie), and `mode_ratio` returns its frequency as a ratio of the total rows.
Usage: `diagnose_numeric(.data, ...)`
Examples: `iris %>% diagnose_numeric() %>% print(width = Inf)`

mode_fn
Description: Returns the mode of a vector.
Usage: `mode_fn(x)`
Examples: `c("b", "b", letters) %>% mode_fn()`

mode_pct
Description: Returns the mode of a vector along with the percentage of the data that the mode accounts for.
Usage: `mode_pct(x)`
Examples: `c("b", "b", letters) %>% mode_pct()`

n_dupes
Description: n_dupes
Usage: `n_dupes(x)`

sample_data1
Description: Sample Data
Usage: `sample_data1`

top_n_vals
Description: top n vals
Usage: `top_n_vals(x, top_n = 3)`
Examples: `tibble::tibble(x = 1:10 %>% c(10,10,10,5,5)) -> t1`, then `t1 %>% top_n_vals()`
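
A few of the numeric diagnostics described for `diagnose_numeric` (`zeros`, `minus`, `infs`, `mode`, `mode_ratio`) can be sketched in base R for a single vector. This is only an illustration of the definitions given in the text, not the package's code: `numeric_summary` is a hypothetical helper, it breaks ties by taking the first value rather than choosing at random, and it counts both `Inf` and `-Inf` as infinite.

```r
# Hypothetical helper, not part of validata.
numeric_summary <- function(x) {
  counts <- table(x)                        # frequency of each distinct value
  top <- names(counts)[which.max(counts)]   # first value on ties (assumption)
  list(
    zeros      = mean(x == 0, na.rm = TRUE),  # ratio of 0 values
    minus      = sum(x < 0, na.rm = TRUE),    # count of negative values
    infs       = sum(is.infinite(x)),         # count of Inf and -Inf values
    mode       = as.numeric(top),             # most common value
    mode_ratio = max(counts) / length(x)      # mode frequency over total length
  )
}

s <- numeric_summary(c(0, 0, -1, 5, Inf, 5, 0, 2))
str(s)
```

In this made-up vector, 0 occurs three times out of eight values, so it is both the mode and the source of the `zeros` ratio.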

view_missing
Description: View rows of the data frame where columns in the tidyselect specification contain missing values. By default, detects missing values in any column. The result is displayed in the viewer pane by default and can optionally be returned as a tibble.
Usage: `view_missing(df, ..., view = TRUE)`
Examples: `view_missing(tibble::tibble(x = c(NA, 1)), view = FALSE)`

| Repository | Version | Published | First seen | Last seen | Docs |
|---|---|---|---|---|---|
| CRAN | 0.1.1 | 2026-05-28 | 2026-05-30 |
No OSV data to display.
No OpenAlex data to display.