Anticipate expectations for dvc commands #26

Closed
ryanraposo opened this issue Dec 16, 2020 · 7 comments

@ryanraposo

ryanraposo commented Dec 16, 2020

DVC commands in VS Code will manifest in different ways.

Tasks

Here's a case study for Tasks, using rake (a make-like build utility for Ruby) as an example.

An item from a typical Rakefile:

task :uninstall do
    puts 'Uninstalling mypackage'
    # `sudo apt-get remove -y mypackage`
end

Task Provider

Tasks are detected or inferred by extensions using Task providers, then served up to VS Code on request.

Here's a pattern overview for a Task provider suited to rake:

  • Instantiated with a workspace root path.
  • Promises Tasks.
  • Watches the workspace Rakefile.
const fileWatcher = vscode.workspace.createFileSystemWatcher(pattern);
  • If the Rakefile is changed, deleted, or created, the cached Promise is discarded so that tasks are detected again on the next request.
  • Runs rake -AT -f Rakefile to list all items in Rakefile.
  • Output is routed to a dedicated vscode.OutputChannel channel:
this.channel = vscode.window.createOutputChannel('Rake Auto Detection');
  • Uses pattern matching on output and creates a vscode.Task for each:
const task = new vscode.Task(
  kind, 
  workspaceFolder,
  taskName,
  'rake',
  new vscode.ShellExecution(`rake ${taskName}`)
);
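
The pattern-matching step above can be sketched roughly as follows. This is a minimal illustration, not the actual provider code: the helper name `parseRakeTasks` is hypothetical, and it assumes each task line in the `rake -AT` output has the form `rake <name>  # <description>`. Each returned name would then be passed to `new vscode.Task(...)` as shown above.

```typescript
// Hypothetical helper: extract task names from `rake -AT` output.
// Assumption: every task line looks like "rake <name>   # <description>".
function parseRakeTasks(output: string): string[] {
  const names: string[] = [];
  for (const line of output.split(/\r?\n/)) {
    // Capture the first token after the leading "rake".
    const match = /^rake\s+(\S+)/.exec(line.trim());
    if (match) {
      names.push(match[1]);
    }
  }
  return names;
}

// Made-up sample output for illustration only.
const sampleOutput = [
  'rake install    # Install mypackage',
  'rake uninstall  # ',
  '',
].join('\n');

// Logs the two task names found in the sample.
console.log(parseRakeTasks(sampleOutput));
```

Keeping the parsing in a plain function, separate from any `vscode` calls, makes it easy to test outside the editor.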

User Experience

When working in the relevant workspace, the user enjoys integrated access to all Rakefile tasks, like 'uninstall' in the example above.

(Will update with examples, benefits, and access points for integrated Tasks.)

Snippets

Commands with 'WHEN' conditions:

{
  "contributes": {
    "menus": {
      "explorer/context": [
        {
          "when": "resourceFilename == params.yaml",
          "command": "dvc.runExperiment",
          "group": "navigation"
        }
      ]
    }
  }
}

Run shell commands and handle output:

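import * as cp from 'child_process';

// Run a shell command and resolve with its captured output.
function exec(command: string, options: cp.ExecOptions): Promise<{ stdout: string; stderr: string }> {
	return new Promise<{ stdout: string; stderr: string }>((resolve, reject) => {
		cp.exec(command, options, (error, stdout, stderr) => {
			if (error) {
				// Return after rejecting so resolve() is not called as well.
				reject({ error, stdout, stderr });
				return;
			}
			resolve({ stdout, stderr });
		});
	});
}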

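const commandLine = 'rake -AT -f Rakefile';
try {
	const { stdout, stderr } = await exec(commandLine, { cwd: folderString });
	if (stderr && stderr.length > 0) {
		getOutputChannel().appendLine(stderr);
		getOutputChannel().show(true);
	}
	if (stdout) { doThings() }
} catch (err: any) {
	// exec() rejects with { error, stdout, stderr }; surface whatever was captured.
	if (err.stderr) {
		getOutputChannel().appendLine(err.stderr);
	}
	getOutputChannel().show(true);
}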

Questions

One job is making sure our vscode-dvc feels like an extension, but the other is making sure that if it begs an extension, we deliver. When it comes to commands:

  • What would that look like?

  • What are the nuances of DVC and its workflows that would make a user overjoyed to see it listed in the marketplace?

  • Aside from visualizations, what would instantly occur to users? "Now I'll be able to..." or "I bet that means ___ will be so easy/accessible!"

  • What are the accepted variables in a DVC workflow?

These are the things that I can't see very well. Please tell me what commands mean to a DVC user :)

@shcheklein
Member

I guess a good proxy is what GitLens, Git History, and the native Git plugins do for VS Code. Yes, you could always jump into the CLI and do the same. But if it's integrated well, it gives:

  1. An experience where you can stay in one place, no jumps
  2. It hides complexity (do you always remember what arguments to pass into a specific command to get a certain result in Git?). E.g. in the experiments case we have a table with some actions on it; it's way easier to navigate it in the webview than in the CLI, it's easier to run commands from the context menu, etc.
  3. It improves visibility - you see hidden files, you see the table, you see the plots and things are running

Regarding tasks vs commands.

DVC has a file that is very similar to a Makefile (dvc.yaml). dvc repro or dvc exp run runs that file. Eventually we'll have a way to visualize the list of stages from that file; dvc repro can already run only a specific subset of targets from it.
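
As a rough sketch of how that could mirror the rake pattern, each stage name could be mapped to a `dvc repro <stage>` invocation. Everything here is illustrative: the `StageTask` shape, the `stagesToTasks` helper, and the stage names `prepare` and `train` are hypothetical, and reading the names out of dvc.yaml is left out.

```typescript
// Hypothetical shape for a dvc.yaml stage exposed as a runnable item.
interface StageTask {
  label: string;
  command: string;
  args: string[];
}

// Map stage names (assumed to have been read from dvc.yaml elsewhere)
// to `dvc repro <stage>` invocations, one per stage.
function stagesToTasks(stageNames: string[], dvcBinary = 'dvc'): StageTask[] {
  return stageNames.map((stage) => ({
    label: `repro: ${stage}`,
    command: dvcBinary,
    args: ['repro', stage],
  }));
}

// Made-up stage names for illustration only.
for (const task of stagesToTasks(['prepare', 'train'])) {
  console.log(task.label, '->', [task.command, ...task.args].join(' '));
}
```

Returning the command and arguments separately, rather than one shell string, avoids quoting issues if the result is later handed to a process or task API.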

I'm still not sure what the difference is between tasks and commands in your question.

@ryanraposo
Author

ryanraposo commented Dec 16, 2020

I guess a good proxy is what GitLens, Git History, and the native Git plugins do for VS Code. Yes, you could always jump into the CLI and do the same. But if it's integrated well, it gives:

1. An experience where you can stay in one place, no jumps

2. It hides complexity (do you always remember what arguments to pass into a specific command to get a certain result in Git?). E.g. in the experiments case we have a table with some actions on it; it's way easier to navigate it in the webview than in the CLI, it's easier to run commands from the context menu, etc.

3. It improves visibility - you see hidden files, you see the table, you see the plots and things are running

Great points.

Regarding tasks vs commands.

DVC has a file that is very similar to a Makefile (dvc.yaml). dvc repro or dvc exp run runs that file. Eventually we'll have a way to visualize the list of stages from that file; dvc repro can already run only a specific subset of targets from it.

That clears that up, thanks.

I'm still not sure what the difference is between tasks and commands in your question.

It's a hard question. They're not mutually exclusive; for example, a command can be used to run a task. Commands can do anything, but only a Task is a Task, and the mechanisms behind serving them are different. Without knowing the API intimately, it might help to consider that Task Providers are similar to Language Providers, in that there is a high emphasis on context.

Since there is a distinction API-wise (class/pattern), it's an important discussion to have. I've just found it to be a theme in my own learning lately: discerning the place for Tasks. Learning DVC and answering the question in its context will just take some thought.

@rogermparent
Contributor

Copied from #28:

We'll need a way to recognize and use Python virtual environments on these Terminal-based commands. DvcReader's child_process invocation does this with a naive check for .env/bin/dvc, running the path to that binary specifically instead of dvc on PATH.

I think the most optimal solution to this that won't break projects is to use VS Code's native settings API to define a few configurable options, for example:

  • DVC Binary File: basic to the point of being self-explanatory, changes the dvc binary that's executed for the extension's operations. We do a basic version of this in DvcReader, but without the VSCode API yet.
  • Virtual Environment Directory: all DVC projects I've seen use .env, but I've seen other Python projects use .venv and really it could be anything. This directory in isolation can be used both for activating on a shell or directly invoking the environment's dvc binary, but has some caveats on both options depending on how each project is configured.
  • Shell initialization command: command run when a DVC terminal is spun up, e.g. source .env/bin/activate or conda activate ./.env.

Same as my ideas on #22, the config setup may be better left for the future. However, adding the basic ability to use .env on a run like DvcReader does (ideally reusing the logic) would be a good idea to implement before merging, as even the demo project's README suggests using a virtual env.
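
A minimal sketch of that basic .env check, under stated assumptions: the helper name `resolveDvcBinary` and its parameters are hypothetical (this is not DvcReader's actual code), and it only covers the `bin/dvc` layout mentioned above. The `exists` parameter is there purely so the lookup can be tested without touching the file system.

```typescript
import { existsSync } from 'fs';
import { join } from 'path';

// Hypothetical helper: prefer the dvc binary inside a virtual environment
// directory (default ".env") and fall back to `dvc` on PATH.
// Assumption: the environment uses the POSIX layout "<envDir>/bin/dvc";
// other layouts (e.g. on Windows) are not handled in this sketch.
function resolveDvcBinary(
  workspaceRoot: string,
  envDir = '.env',
  exists: (path: string) => boolean = existsSync
): string {
  const candidate = join(workspaceRoot, envDir, 'bin', 'dvc');
  return exists(candidate) ? candidate : 'dvc';
}

// Resolves against the current working directory; the result depends on
// whether a .env/bin/dvc file exists there.
console.log(resolveDvcBinary(process.cwd()));
```

If the settings described above are added later, the `envDir` argument is where a configured Virtual Environment Directory could be passed in.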

@ryanraposo
Author

ryanraposo commented Dec 17, 2020

We'll need a way to recognize and use Python virtual environments on these Terminal-based commands. DvcReader's child_process invocation does this with a naive check for .env/bin/dvc, running the path to that binary specifically instead of dvc on PATH.

Good insight/pointer. Sidenote: I'm used to conda, but that's maybe out of scope for now.

I think the most optimal solution to this that won't break projects is to use VS Code's native settings API to define a few configurable options, for example:

DVC Binary File

Great idea, reminds me of:

"python.pythonPath": "C:\\ProgramData\\Miniconda3"

Virtual Environment Directory

Also good. See:

"python.venvFolders": []

Shell initialization command: command run when a DVC terminal is spun up, e.g. source .env/bin/activate or conda activate ./.env.

Solid. Array would be nice, as in:

"terminal.integrated.shellArgs.windows": [],

Same as my ideas on #22, the config setup may be better left for the future. However, adding the basic ability to use .env on a run like DvcReader does (ideally reusing the logic) would be a good idea to implement before merging, as even the demo project's README suggests using a virtual env.

I agree. Glad you mentioned it, though! And okay I'll keep that in mind.

Something to keep in mind also: we can check those settings (which would be fair especially in the case of Python).

@shcheklein
Member

@rogermparent @ryanraposo I think it's worth breaking out proper support for different Python environments into a separate ticket, with some of the details you discussed here. Good stuff. We'll absolutely need to figure this out and make it solid before the release. Not a priority at the moment, though.

@shcheklein
Member

@ryanraposo

It's a hard question. They're not mutually exclusive; for example, a command can be used to run a task. Commands can do anything, but only a Task is a Task, and the mechanisms behind serving them are different. Without knowing the API intimately, it might help to consider that Task Providers are similar to Language Providers, in that there is a high emphasis on context.

Still not clear, to be honest. Are there any links or examples that would help me understand the flow?

@ryanraposo
Author

ryanraposo commented Dec 18, 2020

Still not clear, to be honest. Are there any links or examples that would help me understand the flow?

@shcheklein Absolutely!! I'll get back to you.

A lot of it is based on user flow like you mentioned, so I'll fill you in on that perspective as we work through it. I'll focus on sharing my thought process alongside PRs, and update the OP with the essence of it.
