MDC Troubleshooting Checklist

Dave Parfitt edited this page Nov 21, 2013 · 15 revisions
  • Are there any old patches in basho-patches?

  • Has each node in the source/sink cluster(s) been tuned?

  • Has network connectivity been verified?

  • Static data

    • priority (urgent/high/low)
    • version of Riak EE
      • from sources AND sinks
    • version of replication: v2 | v3 | AAE
    • topology
      • how many clusters
      • bidirectional?
    • app.config settings
      • from sources AND sinks
  • Command line status

    • collected from the cluster_leaders on source AND sink
      • riak-repl status
      • riak-repl connections
      • riak-admin status
  • Erlang state
    • cluster_mgr state: rp(sys:get_status(riak_core_cluster_manager)).
    • connection_mgr state: rp(sys:get_status(riak_core_connection_manager)).
    • service_mgr state: rp(sys:get_status(riak_core_service_manager)).
    • riak_repl2_leader state: rp(sys:get_status(riak_repl2_leader_gs)).
    • riak_repl_leader_gs state: rp(sys:get_status(riak_repl_leader_gs)).
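
The command line status items above can be gathered in one pass. The following is a minimal sketch, not an official Basho tool: it assumes riak-repl and riak-admin are on the PATH of the node it runs on, and the function name collect_status and the output file names are made up for illustration. Run it on the cluster leader of each source AND sink cluster.

```shell
#!/bin/sh
# Hypothetical helper: save the output of the three status commands from the
# checklist into one directory. Assumes riak-repl and riak-admin are on PATH.

collect_status() {
    out_dir=$1
    mkdir -p "$out_dir" || return 1
    # Each entry is "output file name:command to run".
    for entry in \
        "riak-repl-status:riak-repl status" \
        "riak-repl-connections:riak-repl connections" \
        "riak-admin-status:riak-admin status"
    do
        name=${entry%%:*}
        cmd=${entry#*:}
        # Capture stderr as well, so a failing command is still recorded.
        $cmd > "$out_dir/$name.txt" 2>&1 || echo "warning: '$cmd' failed" >&2
    done
}

# Example (illustrative directory name):
#   collect_status "mdc-status-$(hostname)"
```

Keeping one directory per node makes it easy to compare the source and sink output side by side.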

Fullsync issues

  • riak_repl2_fscoordinator: rp(sys:get_status(riak_repl2_fscoordinator)).
  • riak_repl2_fscoordinator_serv: rp(sys:get_status(riak_repl2_fscoordinator_serv)).

Realtime issues

  • riak_repl2_rtq: rp(sys:get_status(riak_repl2_rtq)).
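
To avoid retyping the Erlang expressions from this checklist, they can be generated as a block and pasted into the Erlang console of a running node (for example via riak attach). This is a rough sketch under stated assumptions: the function names erlang_state_snippet and mdc_state_snippet are hypothetical, and the process names are copied verbatim from the lists above. A process that is not running on a given node (for instance the v2 leader on a v3-only setup) will make its expression fail, which is worth noting in the report.

```shell
#!/bin/sh
# Hypothetical helpers: print one rp(sys:get_status(Name)). line per
# registered process name given as an argument.

erlang_state_snippet() {
    for name in "$@"; do
        printf 'rp(sys:get_status(%s)).\n' "$name"
    done
}

# All process names from this checklist:
# general Erlang state, then fullsync, then realtime.
mdc_state_snippet() {
    erlang_state_snippet \
        riak_core_cluster_manager \
        riak_core_connection_manager \
        riak_core_service_manager \
        riak_repl2_leader_gs \
        riak_repl_leader_gs \
        riak_repl2_fscoordinator \
        riak_repl2_fscoordinator_serv \
        riak_repl2_rtq
}

# Example: write the expressions to a file for pasting into the console.
#   mdc_state_snippet > mdc_state.txt
```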