[go/mysql] use sync.Pool for reusing bufio readers and writers #4186

Closed
117 changes: 117 additions & 0 deletions go/mysql/bufio_pool.go
@@ -0,0 +1,117 @@
package mysql

tests please :)

lots of random stuff, and make sure we get the right string out the right side.

also, I'm curious about error conditions etc.


import (
	"bufio"
	"io"
	"sync"
)

// Reader implementation

type bufioReader interface {
	Read([]byte) (int, error)
	Reset(io.Reader)
}

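// readersPool reuses connBufferSize-sized bufio.Readers across connections:
// a reader is borrowed when Read needs one and returned once its buffered
// data has been drained.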
var readersPool = sync.Pool{New: func() interface{} { return bufio.NewReaderSize(nil, connBufferSize) }}

func (pbr *poolBufioReader) getReader() {
	if pbr.br != nil {
		return
	}
	pbr.br = readersPool.Get().(*bufio.Reader)
	pbr.br.Reset(pbr.r)
}

func (pbr *poolBufioReader) putReader() {
	if pbr.br == nil {
		return
	}
	// remove reference
	pbr.br.Reset(nil)
	readersPool.Put(pbr.br)
	pbr.br = nil
}

type poolBufioReader struct {
	r  io.Reader
	br *bufio.Reader
}

func newReader(r io.Reader) bufioReader {
	return &poolBufioReader{
		r: r,
	}
}

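// Read borrows a bufio.Reader from the pool if it does not already hold one,
// and returns it to the pool as soon as no buffered data remains.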
func (pbr *poolBufioReader) Read(b []byte) (int, error) {
	pbr.getReader()
	n, err := pbr.br.Read(b)
	if pbr.br.Buffered() == 0 {
		pbr.putReader()
	}
	return n, err
}

func (pbr *poolBufioReader) Reset(r io.Reader) {
	pbr.putReader()
	pbr.r = r
}

// Writer implementation

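// writersPool mirrors readersPool for the write side: connBufferSize-sized
// bufio.Writers are borrowed on Write and returned on Flush or Reset.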
var writersPool = sync.Pool{New: func() interface{} { return bufio.NewWriterSize(nil, connBufferSize) }}

type bufioWriter interface {
	Write([]byte) (int, error)
	Reset(io.Writer)
	Flush() error
}

func (pbw *poolBufioWriter) getWriter() {
	if pbw.bw != nil {
		return
	}
	pbw.bw = writersPool.Get().(*bufio.Writer)
	pbw.bw.Reset(pbw.w)
}

func (pbw *poolBufioWriter) putWriter() {
	if pbw.bw == nil {
		return
	}
	// remove reference
	pbw.bw.Reset(nil)
	writersPool.Put(pbw.bw)
	pbw.bw = nil
}

type poolBufioWriter struct {
	w  io.Writer
	bw *bufio.Writer
}

func newWriter(w io.Writer) bufioWriter {
	return &poolBufioWriter{
		w: w,
	}
}

func (pbw *poolBufioWriter) Write(b []byte) (int, error) {
	pbw.getWriter()
	return pbw.bw.Write(b)
}

func (pbw *poolBufioWriter) Reset(w io.Writer) {
	pbw.putWriter()
	pbw.w = w
}

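// Flush writes out any buffered bytes and immediately releases the
// bufio.Writer back to the pool.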
func (pbw *poolBufioWriter) Flush() error {
	if pbw.bw == nil {
		return nil
	}
	err := pbw.bw.Flush()
	pbw.putWriter()
	return err
}
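The test request in the inline review comment above could be covered by a round-trip test along the following lines. This is only a sketch: it assumes the file sits in package mysql next to bufio_pool.go (so connBufferSize, newReader, and newWriter are in scope), and the errReader helper and test name are invented here.

package mysql

import (
	"bytes"
	"errors"
	"io"
	"math/rand"
	"testing"
)

// errReader is a test helper whose Read always fails.
type errReader struct{ err error }

func (e errReader) Read([]byte) (int, error) { return 0, e.err }

func TestBufioPoolRoundTrip(t *testing.T) {
	// Random payloads of assorted sizes, including ones larger than connBufferSize,
	// written through the pooled writer and read back through the pooled reader.
	sizes := []int{0, 1, 7, connBufferSize - 1, connBufferSize, 3*connBufferSize + 5}
	for _, size := range sizes {
		want := make([]byte, size)
		rand.Read(want)

		var buf bytes.Buffer
		w := newWriter(&buf)
		if _, err := w.Write(want); err != nil {
			t.Fatalf("Write: %v", err)
		}
		if err := w.Flush(); err != nil {
			t.Fatalf("Flush: %v", err)
		}

		got := make([]byte, size)
		if _, err := io.ReadFull(newReader(&buf), got); err != nil {
			t.Fatalf("ReadFull: %v", err)
		}
		if !bytes.Equal(got, want) {
			t.Fatalf("payload mismatch for size %d", size)
		}
	}

	// Error condition: a failure in the underlying reader must be surfaced.
	r := newReader(errReader{errors.New("boom")})
	if _, err := r.Read(make([]byte, 1)); err == nil {
		t.Fatal("expected error from failing underlying reader")
	}
}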
9 changes: 4 additions & 5 deletions go/mysql/conn.go
@@ -17,7 +17,6 @@ limitations under the License.
package mysql

import (
-	"bufio"
	"fmt"
	"io"
	"net"
@@ -136,8 +135,8 @@ type Conn struct {
	ClientData interface{}

	// Packet encoding variables.
-	reader *bufio.Reader
-	writer *bufio.Writer
+	reader bufioReader
+	writer bufioWriter
	sequence uint8

@danieltahara commented on Sep 7, 2018

github won't let me comment on other lines, so here I go. (https://github.com/vitessio/vitess/blob/master/go/mysql/conn.go#L195)

as mentioned, since most connections sit waiting for read, "pooling" doesn't help much.
however, we can note that every read must read the 4 byte length header, and profiles validate the intuition that most idle connections are blocking on the Read(..) from an empty network buffer.

this leaves us a few options:

1. write a read header function, which does a direct read from the base connection.
2. initialize all conns with a 4-byte buffered reader.
3. have a pool of 4-byte reader buffers for reading headers (this is pointless).

only once we've read the header do we then use the pooled buffered reader.

in both cases I think factoring out the header read is useful, so we don't risk someone messing up the header read by using the bufio.

@sougou thoughts or other ideas? option 1 is the simplest, as we are already allocating the 4-byte arrays -- and no implementation of buffering can avoid the degenerate case problem of 1 byte being sent at a time. I don't like option 2, as we'll get funky behavior with the two wrapped bufios.

the cost to all of this is 1 extra syscall per request served, which, for how we are using the mysql server, is a huge boon. it's obviously a trade between throughput and tail latencies, but I think removing the unexpected and hard-to-debug GC problem and lowering steady-state memory usage makes that tradeoff worthwhile. It's easier to reason about a consistent QoS IMO, and frankly if anyone actually hits throughput problems from the double syscall, I'd love to chat with them and see how we can make this faster :)

all that said, there is an asymmetry between the needs of those using the mysql connection as a client and those using it as part of the server. for that reason it might be worth making the connection take an option, and have the server set that option for its connections (which then makes it instantiate pooled bufios and do the header read directly from the base connection) but have the client leave it off by default.
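
For concreteness, option 1 might look roughly like the following. This is a sketch, not code from this PR: it assumes the standard 4-byte MySQL packet header (3-byte little-endian length plus a sequence byte) and reuses the existing Conn.conn field; readHeader is a hypothetical name.

// readHeader reads the 4-byte packet header directly from the base connection,
// so the wait for the next packet does not go through a bufio.Reader at all.
func (c *Conn) readHeader() (int, uint8, error) {
	var header [4]byte
	if _, err := io.ReadFull(c.conn, header[:]); err != nil {
		return 0, 0, err
	}
	length := int(header[0]) | int(header[1])<<8 | int(header[2])<<16
	return length, header[3], nil
}

Only after this returns would the connection borrow a pooled bufio.Reader (or read the payload directly) for the packet body.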

Contributor

In spite of the benchmarks, I don't think the pooling will give us anything in real life. This is because most connections will generally be idle, and during that time, they will be holding a buffer while waiting on a read.

If we go with the 4-byte approach, we'll need to perform some real-life benchmarks to evaluate the impact of the extra syscall. If we find that it's worth it, then we should not use bufio at all: all waiting reads can fetch into a 4-byte buffer, and then we can directly read the rest of the data into the target slice. There is no benefit to using bufio in this case.

Another option is to reduce the size of bufio to 4K, which is what Go recommends. With the removal of the other two buffers, the overhead of a connection would drop to 4K, which can be more easily accommodated.

We should just look at dropping connBufferSize and using Go's default anyway.
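
To illustrate the "not use bufio at all" idea above, a direct payload read could build on the hypothetical readHeader sketch from the earlier comment (sequence checking and multi-packet handling omitted); readPacketDirect is likewise an invented name.

// readPacketDirect reads one packet without any bufio: the 4-byte header goes
// into a small stack buffer and the payload is read straight into its own slice.
func (c *Conn) readPacketDirect() ([]byte, error) {
	length, _, err := c.readHeader()
	if err != nil {
		return nil, err
	}
	payload := make([]byte, length)
	if _, err := io.ReadFull(c.conn, payload); err != nil {
		return nil, err
	}
	return payload, nil
}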

// fields contains the fields definitions for an on-going
@@ -169,8 +168,8 @@ func newConn(conn net.Conn) *Conn {
	return &Conn{
		conn: conn,

-		reader:   bufio.NewReaderSize(conn, connBufferSize),
-		writer:   bufio.NewWriterSize(conn, connBufferSize),
+		reader:   newReader(conn),
+		writer:   newWriter(conn),
		sequence: 0,
	}
}