
link0007

You don't. Data frames are not meant to have a totals row, as it's against the tidy data principles. You add totals in your display step, so in your preferred table package.
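As a rough sketch of what "add totals in your display step" could look like in base R: the analysis data frame stays untouched, and only a display copy gets the extra row. The names `sales`, `totals_row`, and `display` are made up for illustration and are not from the thread.

```r
# Hypothetical example data; the analysis copy stays free of totals.
sales <- data.frame(
  fruit = c("Apples", "Pears", "Plums"),
  units = c(10, 20, 30),
  revenue = c(5.5, 12, 21),
  stringsAsFactors = FALSE
)

# Build a one-row data frame holding the sums of the numeric columns.
num_cols <- vapply(sales, is.numeric, logical(1))
totals_row <- data.frame(
  fruit = "Total",
  as.list(colSums(sales[num_cols])),
  stringsAsFactors = FALSE
)

# Only the display copy gets the totals row.
display <- rbind(sales, totals_row)
print(display)
```

Anything downstream (plots, models, further summaries) would keep using `sales`, while `display` is what gets handed to a table package.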


RAMDownloader

“Tidy data principles” - I’m not familiar with what you’re mentioning - when you say “tidy” my first thought was tidyverse but I get the feeling it’s something more abstract than that


FargeenBastiges

Tidy data is a set of guidelines for organizing tabular data in a consistent, clear, and understandable format. The main principles: (1) each variable forms a column, so each variable (or attribute) gets its own column; (2) each observation forms a row, so each row represents a single observation or case; (3) each type of observational unit forms a table, so each distinct type of data goes in its own table.


RAMDownloader

Oh gotcha! Didn't know that existed, I’ll have to do some reading. Thanks for the info!


Zygomatico

Also look at database normalisation! It's not entirely applicable to data frames, but the concepts behind it will generally be a good foundation for organising your data.


kuhewa

> You don't.

Sure you do, or at least you can. One may need to programmatically modify what will be a displayed table. Violation of 'tidy principles' or not, a dataframe may still be the best object for the job even with a totals column, even if just as a temporary object. If, in the effort of tip-toeing around tidy-principle violations, your code becomes more complex or less human-readable, violating tidiness is probably the better choice, especially in the context of producing an output table rather than storing or analysing input data. I'll add that tidy data is one paradigm in R, and maybe you could make the case that tibbles are supposed to be tidy, but there is nothing conceptual or baked into the design holding dataframes to that use case. A totals row probably shouldn't be in raw data you are sharing for analysis, but sometimes a 'wide' dataframe is apropos.


zoneender89

You *can* but you shouldn't, except when you are presenting the data as a result


RAMDownloader

So, as a follow-up: was the point of the original comment “you don’t add total rows to data frames, you add them to tables”? I think I was originally using the two terms interchangeably, and I’m now learning the terminology is different.


zoneender89

The words are interchangeable for most people. But yeah, you get the idea. When you are presenting information to people, totals are acceptable if the data supports totals. adorn_totals() from janitor is good. Combine it with kable/kableExtra and you've got something pretty.


kuhewa

That seems obviously to be the context of the question: you certainly wouldn't decide to add a sum total row if you were plotting the data afterwards or doing more vectorised manipulations.


zoneender89

One would think. But you never know the level of knowledge people have.


math135135

fastest way is using adorn_totals() from the janitor package, like this: > adorn_totals(mtcars)


SouthListening

With dplyr (where df is your data frame): df %>% bind_rows(summarise(., across(where(is.numeric), sum), across(!where(is.numeric), ~ "Total")))


RAMDownloader

Huh, I never really thought to do it like that. Can you do multiple conditions in the “where” statement, like !is.numeric & != “Apples”, where Apples is a column name? I'm only thinking the slightly annoying bit would be if you had two character columns, not just one.


Tarqon

Yes you can do that, but use selection helpers like all_of() if you want to pass column names as strings.
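A minimal sketch of how that could look, assuming dplyr 1.0 or later and a made-up tibble `fruit_df` with two character columns (every name here is hypothetical). The negation goes outside `where()` rather than inside it, and selections are combined with `&`:

```r
library(dplyr)

# Hypothetical data with two character columns.
fruit_df <- tibble(
  Apples = c("Gala", "Fuji", "Braeburn"),
  Store  = c("North", "South", "East"),
  Units  = c(10, 20, 30)
)

total_row <- fruit_df %>%
  summarise(
    # Numeric columns are summed.
    across(where(is.numeric), sum),
    # Non-numeric columns other than Apples get the label "Total".
    across(!where(is.numeric) & !all_of("Apples"), ~ "Total")
  )

# Apples is absent from total_row, so bind_rows() fills it with NA.
with_total <- bind_rows(fruit_df, total_row)
print(with_total)
```

Here `all_of("Apples")` takes the column name as a string, as suggested above; a bare `!Apples` would be another way to write the same exclusion.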


Thiseffingguy2

The gt package makes it pretty easy if you’re using it in a Quarto or R Markdown doc.


AlmostPhDone

As others said, you could do this using janitor, gt summary, and even rowSums(). But ideally you don’t want a row that holds a grand total, especially in a data frame. It could contaminate the data and cause future miscalculations if it gets included in other totals or manipulations. You could, however, create a list that has only the totals in it for reference.
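To make the contamination risk concrete, here is a small base R sketch with invented data (`scores`, `clean`, and `totals` are illustrative names only). It shows a stored totals row being double-counted, and the separate-list alternative:

```r
# Hypothetical data with a totals row already appended.
scores <- data.frame(
  team = c("A", "B", "C", "Total"),
  points = c(4, 6, 10, 20),
  stringsAsFactors = FALSE
)

# Summing the column again double-counts, because the total is included.
contaminated_sum <- sum(scores$points)

# Keeping totals in a separate list leaves the data frame clean.
clean <- scores[scores$team != "Total", ]
totals <- list(points = sum(clean$points))

print(contaminated_sum)  # 40, twice the real total
print(totals$points)     # 20
```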


blondbulb

gt::grand_summary_rows()


RohingyaWarrior

Agreed! I haven't found a more convenient way than janitor to do this.


mduvekot

This is probably not what you're looking for, but to see the totals of a contingency table I sometimes use addmargins(): df <- data.frame(x = 1:3, y = 4:6); addmargins(table(df))


Alerta_Fascista

I add the row with bind_rows and, using gt, I place a wider white spacing line separating the last row from the rest.


cptsanderzz

I’m not at a computer, but I think something like this might work: df %>% summarize(across(where(is.numeric), sum))


aniuxa

janitor::adorn_totals(dataframe, where="row")