prepare_for_detach and ongoing displaced stepping

I noticed that "detach" while a program was running sometimes resulted in the process crashing. I tracked it down to this change to prepare_for_detach in commit 187b041 ("gdb: move displaced stepping logic to gdbarch, allow starting concurrent displaced steps"): /* Is any thread of this process displaced stepping? If not, there's nothing else to do. */ - if (displaced->step_thread == nullptr) + if (displaced_step_in_progress (inf)) return; The problem above is that the condition was inadvertently flipped. It should have been: if (!displaced_step_in_progress (inf)) So I fixed it, and wrote a testcase to exercise it. The testcase has a number of threads constantly stepping over a breakpoint, and then GDB detaches the process, while threads are running and stepping over the breakpoint. And then I was surprised that my testcase would hang -- GDB would get stuck in an infinite loop in prepare_for_detach, here: while (displaced_step_in_progress (inf)) { ... What is going on is that since we now have two displaced stepping buffers, as one displaced step finishes, GDB starts another, and there's another one already in progress, and on and on, so the displaced_step_in_progress condition never turns false. This happens because we go via the whole handle_inferior_event, which tries to start new step overs when one finishes. And also because while we remove breakpoints from the target before prepare_for_detach is called, handle_inferior_event ends up calling insert_breakpoints via e.g. keep_going. Thinking through all this, I came to the conclusion that going through the whole handle_inferior_event isn't ideal. A _lot_ is done by that function, e.g., some thread may get a signal which is passed to the inferior, and gdb decides to try to get over the signal handler, which reinstalls breakpoints. Or some process may exit. We can end up reporting these events via normal_stop while detaching, maybe end up running some breakpoint commands, or maybe even something runs an inferior function call. Etc. All this after the user has already declared they don't want to debug the process anymore, by asking to detach. I came to the conclusion that it's better to do the minimal amount of work possible, in a more controlled fashion, without going through handle_inferior_event. So in the new approach implemented by this patch, if there are threads of the inferior that we're detaching in the middle of a displaced step, stop them, and cancel the displaced step. This is basically what stop_all_threads already does, via wait_one and (the now factored out) handle_one, so I'm reusing those. gdb/ChangeLog: * infrun.c (struct wait_one_event): Move higher up. (prepare_for_detach): Abort in-progress displaced steps instead of letting them complete. (handle_one): If the inferior is detaching, don't add the thread back to the global step-over chain. (restart_threads): Don't restart threads if detaching. (handle_signal_stop): Remove inferior::detaching reference.
tromey · Feb 3, 2021 · 8ff5313 · 8ff5313
1 parent 9147506
commit 8ff5313
Show file tree

Hide file tree

Showing 2 changed files with 77 additions and 57 deletions.
diff --git a/gdb/ChangeLog b/gdb/ChangeLog
@@ -1,3 +1,13 @@
+2021-02-03  Pedro Alves  <pedro@palves.net>
+
+	* infrun.c (struct wait_one_event): Move higher up.
+	(prepare_for_detach): Abort in-progress displaced steps instead of
+	letting them complete.
+	(handle_one): If the inferior is detaching, don't add the thread
+	back to the global step-over chain.
+	(restart_threads): Don't restart threads if detaching.
+	(handle_signal_stop): Remove inferior::detaching reference.
+
 2021-02-03  Pedro Alves  <pedro@palves.net>
 
 	* infrun.c (prepare_for_detach): Don't release scoped_restore

diff --git a/gdb/infrun.c b/gdb/infrun.c
@@ -3551,6 +3551,22 @@ do_target_wait (ptid_t wait_ptid, execution_control_state *ecs,
   return false;
 }
 
+/* An event reported by wait_one.  */
+
+struct wait_one_event
+{
+  /* The target the event came out of.  */
+  process_stratum_target *target;
+
+  /* The PTID the event was for.  */
+  ptid_t ptid;
+
+  /* The waitstatus.  */
+  target_waitstatus ws;
+};
+
+static bool handle_one (const wait_one_event &event);
+
 /* Prepare and stabilize the inferior for detaching it.  E.g.,
    detaching while a thread is displaced stepping is a recipe for
    crashing it, as nothing would readjust the PC out of the scratch
@@ -3561,53 +3577,60 @@ prepare_for_detach (void)
 {
   struct inferior *inf = current_inferior ();
   ptid_t pid_ptid = ptid_t (inf->pid);
-
-  /* Is any thread of this process displaced stepping?  If not,
-     there's nothing else to do.  */
-  if (displaced_step_in_progress (inf))
-    return;
-
-  infrun_debug_printf ("displaced-stepping in-process while detaching");
+  scoped_restore_current_thread restore_thread;
 
   scoped_restore restore_detaching = make_scoped_restore (&inf->detaching, true);
 
-  while (displaced_step_in_progress (inf))
+  /* Remove all threads of INF from the global step-over chain.  We
+     want to stop any ongoing step-over, not start any new one.  */
+  thread_info *next;
+  for (thread_info *tp = global_thread_step_over_chain_head;
+       tp != nullptr;
+       tp = next)
     {
-      struct execution_control_state ecss;
-      struct execution_control_state *ecs;
+      next = global_thread_step_over_chain_next (tp);
+      if (tp->inf == inf)
+	global_thread_step_over_chain_remove (tp);
+    }
 
-      ecs = &ecss;
-      memset (ecs, 0, sizeof (*ecs));
+  if (displaced_step_in_progress (inf))
+    {
+      infrun_debug_printf ("displaced-stepping in-process while detaching");
 
-      overlay_cache_invalid = 1;
-      /* Flush target cache before starting to handle each event.
-	 Target was running and cache could be stale.  This is just a
-	 heuristic.  Running threads may modify target memory, but we
-	 don't get any event.  */
-      target_dcache_invalidate ();
+      /* Stop threads currently displaced stepping, aborting it.  */
 
-      do_target_wait (pid_ptid, ecs, 0);
+      for (thread_info *thr : inf->non_exited_threads ())
+	{
+	  if (thr->displaced_step_state.in_progress ())
+	    {
+	      if (thr->executing)
+		{
+		  if (!thr->stop_requested)
+		    {
+		      target_stop (thr->ptid);
+		      thr->stop_requested = true;
+		    }
+		}
+	      else
+		thr->resumed = false;
+	    }
+	}
 
-      if (debug_infrun)
-	print_target_wait_results (pid_ptid, ecs->ptid, &ecs->ws);
+      while (displaced_step_in_progress (inf))
+	{
+	  wait_one_event event;
 
-      /* If an error happens while handling the event, propagate GDB's
-	 knowledge of the executing state to the frontend/user running
-	 state.  */
-      scoped_finish_thread_state finish_state (inf->process_target (),
-					       minus_one_ptid);
+	  event.target = inf->process_target ();
+	  event.ptid = do_target_wait_1 (inf, pid_ptid, &event.ws, 0);
 
-      /* Now figure out what to do with the result of the result.  */
-      handle_inferior_event (ecs);
+	  if (debug_infrun)
+	    print_target_wait_results (pid_ptid, event.ptid, &event.ws);
 
-      /* No error, don't finish the state yet.  */
-      finish_state.release ();
+	  handle_one (event);
+	}
 
-      /* Breakpoints and watchpoints are not installed on the target
-	 at this point, and signals are passed directly to the
-	 inferior, so this must mean the process is gone.  */
-      if (!ecs->wait_some_more)
-	error (_("Program exited while detaching"));
+      /* It's OK to leave some of the threads of INF stopped, since
+	 they'll be detached shortly.  */
     }
 }
 
@@ -4361,20 +4384,6 @@ poll_one_curr_target (struct target_waitstatus *ws)
   return event_ptid;
 }
 
-/* An event reported by wait_one.  */
-
-struct wait_one_event
-{
-  /* The target the event came out of.  */
-  process_stratum_target *target;
-
-  /* The PTID the event was for.  */
-  ptid_t ptid;
-
-  /* The waitstatus.  */
-  target_waitstatus ws;
-};
-
 /* Wait for one event out of any target.  */
 
 static wait_one_event
@@ -4558,9 +4567,10 @@ mark_non_executing_threads (process_stratum_target *target,
 /* Handle one event after stopping threads.  If the eventing thread
    reports back any interesting event, we leave it pending.  If the
    eventing thread was in the middle of a displaced step, we
-   cancel/finish it.  Returns true if there are no resumed threads
-   left in the target (thus there's no point in waiting further),
-   false otherwise.  */
+   cancel/finish it, and unless the thread's inferior is being
+   detached, put the thread back in the step-over chain.  Returns true
+   if there are no resumed threads left in the target (thus there's no
+   point in waiting further), false otherwise.  */
 
 static bool
 handle_one (const wait_one_event &event)
@@ -4663,7 +4673,8 @@ handle_one (const wait_one_event &event)
 		 target_pid_to_str (t->ptid).c_str ());
 
 	      t->control.trap_expected = 0;
-	      global_thread_step_over_chain_enqueue (t);
+	      if (!t->inf->detaching)
+		global_thread_step_over_chain_enqueue (t);
 	    }
 	}
       else
@@ -4687,7 +4698,8 @@ handle_one (const wait_one_event &event)
 	    {
 	      /* Add it back to the step-over queue.  */
 	      t->control.trap_expected = 0;
-	      global_thread_step_over_chain_enqueue (t);
+	      if (!t->inf->detaching)
+		global_thread_step_over_chain_enqueue (t);
 	    }
 
 	  regcache = get_thread_regcache (t);
@@ -6111,7 +6123,6 @@ handle_signal_stop (struct execution_control_state *ecs)
   if (random_signal)
     {
       /* Signal not for debugging purposes.  */
-      struct inferior *inf = find_inferior_ptid (ecs->target, ecs->ptid);
       enum gdb_signal stop_signal = ecs->event_thread->suspend.stop_signal;
 
       infrun_debug_printf ("random signal (%s)",
@@ -6124,8 +6135,7 @@ handle_signal_stop (struct execution_control_state *ecs)
 	 to remain stopped.  */
       if (stop_soon != NO_STOP_QUIETLY
 	  || ecs->event_thread->stop_requested
-	  || (!inf->detaching
-	      && signal_stop_state (ecs->event_thread->suspend.stop_signal)))
+	  || signal_stop_state (ecs->event_thread->suspend.stop_signal))
 	{
 	  stop_waiting (ecs);
 	  return;