Please consider treating datagrids as inputs as well as outputs? #1560

mdaeron · 2024-07-19T11:08:18Z

This is half a feature request, half a question about how shiny conceptualizes data input.

I'd like to use shiny to help my researcher colleagues process some analytical data. Because most of the potential users are not proficient coders, the most accessible/realistic way to input data should be copy-and-paste from a spreadsheet app into the shiny app. Pasting into an editable datagrid does not appear to be currently implemented, and the fact that DataGrid is explicitly defined as an output makes me wonder if my use case is at odds with some core philosophy of shiny, such as "input data resides on the server", or "input data must be uploaded from a file".

My two cents: pasting into a datagrid offers at least two benefits over simply pasting data into a text area: (a) because any problems in formatting/cell mismatches are immediately apparent, the user can easily perform some kind of visual data validation; (b) one thing end-users frequently mistype is column headers; this problem disappears if the headers are already specified by a datagrid input.

So, is it possible that datagrids will some day be treated as full-blown inputs, or would that break things and/or go against some core conceptualization of py-shiny?

schloerke · 2024-07-22T16:41:11Z

Related - Add ability to update cells and data from the server: #1449

> Pasting into an editable datagrid does not appear to be currently implemented ...

Correct. As of v1.0.0, this is not implemented. (But it could be in the future! I will expand on this in a follow-up comment.)

> ... and the fact that DataGrid is explicitly defined as an output makes me wonder if my use case is at odds with some core philosophy of shiny, such as "input data resides on the server", or "input data must be uploaded from a file".

We already have the ability to edit a data frame within the browser via a single cell edit.

The current approach has been:

- The data starts as the return value of the function decorated with @render.data_frame.
- An edit is made in the browser, and the resulting patch is sent to the server.
- The server processes the patch. It approves the patch by returning it (along with any other updated patches) without raising an error.
- The browser receives the approved patches and inserts them into the displayed table.

We can access the updated data via .data_view() and the original data via .data()
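
To make the round trip above concrete, here is a minimal plain-Python model of it. This is a sketch under stated assumptions, not Shiny's implementation: the `EditableGrid` class, the `approve_patches` function, the `submit` and `rerender` methods, and the exact patch fields are all hypothetical names chosen for illustration. Only the idea (approve by returning, reject by raising, original data kept separate from the edited view) comes from the description above.

```python
from typing import Any, Callable, TypedDict


class CellPatch(TypedDict):
    # Hypothetical patch shape for this sketch: which cell changed, and to what.
    row_index: int
    column_index: int
    value: Any


def approve_patches(patches: list[CellPatch]) -> list[CellPatch]:
    """Server-side check: raising rejects the batch, returning approves it."""
    approved: list[CellPatch] = []
    for patch in patches:
        if patch["column_index"] == 1:
            # Column 1 holds numbers in this sketch; float() raises on bad input.
            patch = {**patch, "value": float(patch["value"])}
        approved.append(patch)
    return approved


class EditableGrid:
    """Toy stand-in for a rendered, editable data frame (not a Shiny class)."""

    def __init__(
        self,
        rows: list[list[Any]],
        patches_fn: Callable[[list[CellPatch]], list[CellPatch]],
    ) -> None:
        self._original = [list(row) for row in rows]
        self._patches: dict[tuple[int, int], Any] = {}
        self._patches_fn = patches_fn

    def data(self) -> list[list[Any]]:
        """The data as originally rendered, without user edits."""
        return [list(row) for row in self._original]

    def data_view(self) -> list[list[Any]]:
        """The original data with all approved patches applied."""
        view = self.data()
        for (row, col), value in self._patches.items():
            view[row][col] = value
        return view

    def submit(self, patches: list[CellPatch]) -> bool:
        """Browser sends patches; they are kept only if the server approves."""
        try:
            approved = self._patches_fn(patches)
        except (TypeError, ValueError):
            return False  # rejected: the displayed table stays unchanged
        for patch in approved:
            key = (patch["row_index"], patch["column_index"])
            self._patches[key] = patch["value"]
        return True

    def rerender(self, rows: list[list[Any]]) -> None:
        """Re-running the render function drops every user edit."""
        self._original = [list(row) for row in rows]
        self._patches.clear()


grid = EditableGrid([["a", 1.0], ["b", 2.0]], approve_patches)
ok = grid.submit([{"row_index": 0, "column_index": 1, "value": "3.5"}])
bad = grid.submit([{"row_index": 1, "column_index": 1, "value": "oops"}])
```

In this model, the first edit is approved and shows up in `data_view()` while `data()` still returns the original rows; the second edit is rejected because `float("oops")` raises.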

> is at odds with some core philosophy of shiny

It is definitely different as updates are happening to the rendered output after the initial rendering has occurred.

But if the function decorated with @render.data_frame is executed again (due to reactivity), then (currently) the user's edits, sorting, and filtering are dropped. Everything is reset. This prevents us from getting stuck in a permanently dirty state, where a previous interaction has compromised the current state and there is no way to escape it.

> such as "input data resides on the server", or "input data must be uploaded from a file"

@render.data_frame needs the data to be returned from its decorated function. It does not care how the data gets there... from a file or from the server itself.

schloerke · 2024-07-22T16:59:28Z

Questions I run into with pasting data:

Situation: Pasted data has new columns
- ... Should they be string types on the server?
- What if they look like numbers? Should they be converted then?
- Can the server reject the new columns but keep the cells in the current columns? How can we accept and reject in a single request?
  - Currently, it either returns an accepted set of patches or the originally edited cells fail.
  - Maybe change the return patch structure to allow for error: str?
- What new column names should be used?
  - I do not believe they're included in the pasted data
Situation: If the column type can be determined from the pasted data...
- Should we auto reject if the types are a mismatch?
- Should .set_patches_fn handle everything?
Situation: Pasted data has new rows
- Add new rows! (This isn't contentious, but will need to be addressed)
- What value do we use for the columns that did not receive data?
Situation: Remove columns
- If you can add columns, it quickly extends to being able to remove columns
Situation: Remove rows
- If the newly pasted data is smaller, should the row count be trimmed?
- If not, should they be None values on the server? (What value is used as a placeholder value? Will it conflict with the column type (polars)?)
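
One possible way to explore these questions is sketched below in plain Python. It is an illustration under stated assumptions, not an existing or proposed Shiny API: `parse_clipboard`, `paste_to_patches`, and the `PatchResult` shape are hypothetical names, the `error` field follows the "error: str" idea raised above, and the choice to reject cells that fall outside the current rows and columns is just one of the possible answers. It also assumes the clipboard text is tab-separated with one row per line, which is how spreadsheet apps commonly copy cell ranges.

```python
from typing import Any, Callable, Optional, TypedDict


class PatchResult(TypedDict):
    # Hypothetical shape: a cell patch plus an optional rejection reason.
    row_index: int
    column_index: int
    value: Any
    error: Optional[str]


def parse_clipboard(text: str) -> list[list[str]]:
    """Split pasted text into rows (newlines) and cells (tabs)."""
    lines = text.replace("\r\n", "\n").rstrip("\n").split("\n")
    return [line.split("\t") for line in lines]


def paste_to_patches(
    text: str,
    top: int,
    left: int,
    column_types: list[Callable[[str], Any]],
    n_rows: int,
) -> list[PatchResult]:
    """Turn a paste anchored at (top, left) into per-cell accept/reject results."""
    results: list[PatchResult] = []
    for r, cells in enumerate(parse_clipboard(text)):
        for c, raw in enumerate(cells):
            row, col = top + r, left + c
            value: Any = raw
            error: Optional[str] = None
            if col >= len(column_types):
                # The paste carries no header, so no name or type is known.
                error = "new column: no name or type available"
            elif row >= n_rows:
                error = "new row: adding rows is not handled in this sketch"
            else:
                try:
                    # Coerce the pasted string to the existing column's type.
                    value = column_types[col](raw)
                except ValueError:
                    type_name = column_types[col].__name__
                    error = f"cannot convert {raw!r} to {type_name}"
            results.append(
                {"row_index": row, "column_index": col, "value": value, "error": error}
            )
    return results


# A 2x2 table (str, float) receiving a paste with a bad number and an extra column.
pasted = "a\t1.5\r\nb\tn/a\textra\r\n"
results = paste_to_patches(pasted, top=0, left=0, column_types=[str, float], n_rows=2)
accepted = [p for p in results if p["error"] is None]
rejected = [p for p in results if p["error"] is not None]
```

Returning a result per cell, instead of failing the whole batch, is what lets the cells in existing columns be kept while cells in the unexpected column are rejected within a single request.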

On the server side, we can get pretty far with the .update_cell_value() and .update_data() methods proposed in #1449.

By only updating cells, we can accept/reject immediately.
When updating data, it will come with a new set of columns, rows, and data types all at once. No questions around missing column names or missing cell values.

Goal: Reduce this question set. Any thoughts or approaches are appreciated!

github-actions bot added the needs-triage label Jul 19, 2024

schloerke added the enhancement (New feature or request) and data frame (Related to @render.data_frame) labels and removed the needs-triage label Jul 22, 2024

schloerke mentioned this issue Aug 26, 2024: Epic -- Data frames for v1.1.0 #1639 (Open, 38 tasks)
