Formatting and other dictation commands in ccr sequences #86

Closed
chilimangoes opened this issue Aug 11, 2015 · 14 comments
Labels
Enhancement Enhancement of an existing feature Suggestion A suggestion is not defined enough to be a considered new feature or enhancement by OP

Comments

@chilimangoes
Collaborator

An issue I run into quite a bit when using Caster is that I'll want to use a formatting command whose dictation contains one or more words from the CCR command list. For example, yesterday afternoon I was editing an HTML file with the HTML vocabulary enabled. One of the class names used on the page was "inlineModalDialog", but when I said "gerrish gum bow in-line modal dialog", what I got was "inlineModal".

This doesn't happen all the time, but it's often enough to be really irritating, especially when searching through caster's source code to learn how it works, what the built-in commands are, and how I can extend it. Half the time, the word I'm searching for is a CCR command, so I end up having to resort to using the keyboard.

I was able to work around this problem by moving the text formatting commands in confignavigation.txt from the CCR section to the non-CCR section. However, there are times when it would be convenient to be able to mix text formatting commands in with CCR commands, and this got me thinking about the strategy proposed on this blog post:

http://handsfreecoding.org/?p=100

Basically, it involves having a separate category of commands that can only be invoked at the end of an utterance. This way, you can still invoke a series of CCR commands, followed by a command that includes dictation, but the dictation command must come last in the sequence.
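A minimal sketch of that idea in plain Python, independent of Dragonfly and Caster. The command words in `CCR_COMMANDS`, the helper names, and the choice to treat "dialog" as a CCR word are all hypothetical and only serve to reproduce the situation described above; the two formatters follow the behavior of "gerrish gum" and "tie gum" shown in this thread.

```python
# Hypothetical CCR vocabulary (spoken word -> output); not Caster's real command list.
CCR_COMMANDS = {"ace": " ", "equals": "=", "shock": "\n", "dialog": "<dialog>"}


def _clean(words):
    # Drop hyphens so that "in-line" is treated as the single word "inline".
    return [w.replace("-", "").lower() for w in words]


def camel_gum(words):
    # camelCase with no separators, e.g. "some identifier" -> "someIdentifier".
    words = _clean(words)
    return "".join(words[:1] + [w.capitalize() for w in words[1:]])


def title_gum(words):
    # TitleCase with no separators, e.g. "property one" -> "PropertyOne".
    return "".join(w.capitalize() for w in _clean(words))


# Terminal (dictation-taking) commands, keyed by their spoken prefix.
TERMINAL_COMMANDS = {
    ("gerrish", "gum", "bow"): camel_gum,
    ("tie", "gum", "bow"): title_gum,
}
PREFIX_LEN = 3


def parse_utterance(utterance):
    words = utterance.split()
    output = []
    i = 0
    while i < len(words):
        formatter = TERMINAL_COMMANDS.get(tuple(words[i:i + PREFIX_LEN]))
        if formatter is not None:
            # A terminal command ends the sequence: every remaining word is
            # dictation, even if it also happens to be a CCR command word.
            output.append(formatter(words[i + PREFIX_LEN:]))
            break
        output.append(CCR_COMMANDS[words[i]])  # KeyError on an unknown word
        i += 1
    return "".join(output)
```

With this split, `parse_utterance("gerrish gum bow in-line modal dialog")` keeps "dialog" as dictation, while an utterance such as "ace equals ace tie gum bow property two" still runs the CCR commands before the terminal one.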

This is something that I could probably work on, but I wanted to get your thoughts on the approach, and maybe brainstorm a little bit first.

synkarius added a commit that referenced this issue Aug 12, 2015
@synkarius
Collaborator

I too am constantly annoyed by this problem. James's solution is extremely interesting; thank you for pointing it out. Actually, I like James's solution so much that I have implemented a version of it in the new develop branch, so that I can use it at work tomorrow. Please do not consider this to be an end to the conversation. It may be adequate, it may not; my attempt is basically my first thoughts on the subject.

The way I have it implemented here, any active CCR command can be used as the original or terminal command. This removes the need to read in and store the original and terminal commands separately. Do you think this approach works, or do we want the original and terminal commands to be a limited subset of the overall CCR command set?

@synkarius
Collaborator

So for your example, now you would say, "terminal gerrish gum bow in-line modal dialog".

@synkarius synkarius added Enhancement Enhancement of an existing feature Suggestion A suggestion is not defined enough to be a considered new feature or enhancement by OP labels Aug 12, 2015
@chilimangoes
Collaborator Author

Wow, that was quick! I'm still trying to wrap my head around how all the pieces of caster work, especially the CCR portion.

I was originally thinking of just putting dictation-containing sequences at the end, without the terminal trigger word, but keeping the trigger keyword might be better. On the one hand, I'm kind of a clumsy speaker and saying "gerrish gum bow some identifier" is already enough of a tongue twister without adding "terminal" at the start. But on the other hand, it's often nice to be able to say "gerrish gum bow some identifier ace equals ace" or to say "find" followed by "gerrish gum bow some identifier shock" to immediately trigger a search. This solution allows for that.

One idea I had was to have a main sequence, followed by the terminal sequence, followed by a very limited number of optional keywords, like ace and shock. But that might just be too much work and complexity for a minor gain in usability. Some of the other command sequences are getting easier to say, so I think I'll try this solution for a while and see if I can wrap my tongue around it.

I was a little confused by the "original" element that you added to the spec. What's the purpose of that? It looks to me like it's just the same as the main sequence, but with "original" tacked on to the end and non-repeatable.

@ghost

ghost commented Aug 12, 2015

Just to chime in quickly with a note on usability: it's a mouthful to dictate the string of formatting commands each time. Why not have it remember the last formatting command string and/or presets that can be saved?

The command "bow" would automatically assume the format that was last dictated. A command "bow #" (where # is a number) could recall saved presets. The hope is to come up with an idea that minimizes the number and frequency of formatting commands.

Regarding the subject at hand, couldn't we force Dragon into dictation mode during free dictation? The end of free dictation would be triggered by a short pause in dictation, which would re-enable command mode. Then the formatting would be applied.

Or

Why can't we use ContextSeeker to limit ccr commands during free dictation?

Apologies that I can't give examples or be more detailed, but I'm dictating on my Android phone.

@chilimangoes
Collaborator Author

Regarding your first question, there's already a "format" command that I think does what you want - it remembers the last format you used. So for example, you can say:

"tie gum bow property one"
"ace equals ace"
"format property two"

And caster will output:

"PropertyOne = PropertyTwo"

One thing that would be nice, and that I don't think is implemented from looking at the code, is to explicitly set the format mode without having to use it in an utterance first. For example, in the case above I would like to be able to say "set mode tie gum", pause, and then "format property one". Like you said, the formatting commands are a mouthful, and I always have to pause and think after saying the format mode. I might take a crack at implementing this tomorrow.
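To make the two behaviors concrete, here is a rough sketch in plain Python of a formatter that remembers the last mode used and also lets the mode be set explicitly. The class and method names are hypothetical and this is not Caster's actual implementation; only the "tie", "gerrish", and "gum" behaviors shown in this thread are modeled.

```python
# Capitalization and spacing words as used in this thread; other modes are omitted.
CAPITALIZATION = {
    "tie": lambda ws: [w.capitalize() for w in ws],                   # TitleCase
    "gerrish": lambda ws: ws[:1] + [w.capitalize() for w in ws[1:]],  # camelCase
}
SPACING = {"gum": ""}  # "gum" joins the words with no separator


class FormatState:
    """Hypothetical holder for the most recently used format mode."""

    def __init__(self):
        self.mode = None  # (capitalization word, spacing word) or None

    def set_mode(self, cap, spacing):
        # Explicitly choose a mode without formatting anything yet.
        if cap not in CAPITALIZATION or spacing not in SPACING:
            raise ValueError("unknown format mode: %s %s" % (cap, spacing))
        self.mode = (cap, spacing)

    def format_with(self, cap, spacing, text):
        # Like saying "tie gum bow <text>": format and remember the mode.
        self.set_mode(cap, spacing)
        return self.format(text)

    def format(self, text):
        # Like saying "format <text>": reuse whatever mode was set last.
        if self.mode is None:
            raise RuntimeError("no format mode has been set yet")
        cap, spacing = self.mode
        words = [w.lower() for w in text.split()]
        return SPACING[spacing].join(CAPITALIZATION[cap](words))
```

In this sketch, `format_with("tie", "gum", "property one")` followed by `format("property two")` mirrors the existing behavior, and calling `set_mode("tie", "gum")` first mirrors the proposed "set mode" command.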

Regarding your idea to switch to dictation mode, I don't think it'll work. I'm pretty sure Dragon doesn't allow us to programmatically switch to dictation mode in the middle of an utterance. I also don't think it's possible to capture raw dictation in dragonfly rules to be able to do the post dictation formatting. AFAIK, dragonfly grammars only work in command mode (or with commands uttered in mixed mode).

I'd like to hear what @synkarius has to say about the use of ContextSeeker, as I haven't quite wrapped my head around that class yet.

@synkarius
Collaborator

I put in the "original" extra for symmetry. It so happens that the text formatting commands have the dictation extra at the end, but you can easily imagine versions of them which have it at the beginning, or other such commands.

I agree that this is getting to be a lot to speak for a single text format. I have two thoughts on this. The first is that CCR commands, more than regular commands, have to be phonetically distinct, and the easiest way to achieve that is making them longer. The second is that #72 will let you override specs.

@zone22, unfortunately @chilimangoes is completely right on both counts. You can't switch modes during execution, and ContextSeeker only has control of other commands, none of free dictation.

@synkarius
Collaborator

@zone22 -- That said, it would be possible to create a command which counts the number of words in the last (free dictation) utterance, deletes that many words back, and then prints a formatted version of them.

@synkarius
Collaborator

@zone22 -- You could do that command with pure Dragonfly. You just have to use a RecognitionHistory. Caster's RecognitionHistory is control.nexus().history.
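A rough sketch of the bookkeeping such a command would need, written in plain Python so it runs without Dragonfly. It assumes the history behaves like a list whose entries are sequences of recognized words, and that Ctrl+Backspace deletes one word in the target editor; `plan_reformat` and `title_gum` are hypothetical names. The returned key spec uses Dragonfly's `Key` spec syntax, so in a real rule the two results would presumably be passed to `Key(...)` and `Text(...)`.

```python
def title_gum(words):
    # TitleCase with no separators, matching "tie gum" as used in this thread.
    return "".join(w.capitalize() for w in words)


def plan_reformat(history, formatter):
    """Return (key_spec, replacement_text) for the most recent utterance.

    `history` is assumed to be list-like, with each entry a sequence of the
    words recognized in one utterance. Returns None if the history is empty.
    """
    if not history:
        return None
    last_words = list(history[-1])
    # One Ctrl+Backspace per word removes the raw dictation (assumption:
    # the editor deletes a whole word per press).
    delete_spec = "c-backspace:%d" % len(last_words)
    return delete_spec, formatter(last_words)
```

Whether the reformatting command's own utterance is already in the history when it runs is not handled here; a real command would need to pick the right entry.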

@chilimangoes -- I think that set formatting mode command is a good idea.

@chilimangoes
Collaborator Author

I think streamlining the entry of formatted identifiers is one of the biggest potential areas for improving usability of day-to-day programming work because it's a huge percentage of the code we write.

I agree with both your points about the reasons for the complexity of formatting commands. I also think there are some other ways we can improve usability without hurting recognition accuracy. The "format" command, combined with explicitly setting the text format, is one example. I also like the idea of being able to format the text after the fact, sort of like a "format that" command. Although rather than working on the previous dictation, I would vote to make it more generally applicable, by working on any arbitrary range of text (e.g. "queue lease four" followed by "format that tie gum"). I imagine this could be accomplished by transferring the currently selected text to the clipboard, formatting it, and pasting back. Unfortunately, since I use DNS 13 I'm not sure how often I would be able to use this feature. :(

Addressing the cognitive load required to remember all the words you want output in the formatted text is another potential area for improvement. For example, if I'm trying to output SomeLongIdentifier, I will frequently pause or stutter somewhere in the middle saying "tie gum bow some long identifier" and then I end up having to say multiple formatting commands to create a single identifier. We could have a command that opens up a dialog box specifically designed for dictating identifiers. For example, saying "identifier dialog tie gum bow" would bring up a dialog with a text box that allows me to dictate some text, and when I click okay or say an equivalent terminator keyword, the text would be formatted and transferred to the program I was working in. If this sounds like something others would like, I can open up a separate issue for it.

@synkarius
Collaborator

Regarding "format that", context.read_selected_without_altering_clipboard() could probably be of some use.
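As a sketch only, the "format that" flow (read the selection, format it, write it back) could look like the following. The selection reader and the paste function are injected as plain callables because their real signatures are not assumed here; `format_that` and `title_gum` are hypothetical names.

```python
def title_gum(words):
    # TitleCase with no separators, matching "tie gum" as used in this thread.
    return "".join(w.capitalize() for w in words)


def format_that(read_selection, paste_text, formatter):
    """Reformat the currently selected text in place.

    read_selection: callable returning the selected text (or None/"" if
                    nothing is selected).
    paste_text:     callable that types or pastes a string over the selection.
    formatter:      callable taking a list of words and returning a string.
    """
    selected = read_selection()
    if not selected:
        return None  # nothing selected, so leave the document untouched
    formatted = formatter(selected.split())
    paste_text(formatted)
    return formatted


# Usage with stand-in callables instead of a real editor:
pasted = []
result = format_that(lambda: "some long identifier", pasted.append, title_gum)
```

Keeping the I/O behind callables makes the formatting step easy to test, and leaves open whether the clipboard is used at all.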

Regarding the dialog box idea, have a look at BoxAction. It's not complete, but the version of it in 0.5.2 (which can do settings.QTYPE_DEFAULT -type tk windows so far) should be adequate to implement the idea you described above. There's an example of it in dev.py. Eventually, I'll be switching all of the pop-up and dialog windows over to BoxAction, rather than the disjointed way they work now.

@synkarius
Collaborator

Also, I completely agree with you about streamlining the entry of identifiers. At work, I find myself using the copy and paste commands far too often because Caster is lacking in this area.

@chilimangoes
Collaborator Author

Thanks, I'll check those out.

@chilimangoes
Collaborator Author

@zone22, I've added a PR (#87) with an explicit "set format" command, as discussed above. After using this for a few hours today, I'm really liking it.

@ghost

ghost commented Aug 21, 2015

@chilimangoes
Excellent!
