Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread blocked #153

Open
zhaoyvli opened this issue Jun 27, 2022 · 14 comments
Open

Thread blocked #153

zhaoyvli opened this issue Jun 27, 2022 · 14 comments
Labels
question Further information is requested

Comments

@zhaoyvli
Copy link

image

When I execute an SQL query that takes a long time, I get thread blocked,
and creating a new query will fail

@zhicwu
Copy link
Contributor

zhicwu commented Jun 27, 2022

Hi @MrZhao1024B, it's a warning from Vert.x but it should not block your next query. Are you using latest JDBC bridge?

@zhicwu zhicwu added the question Further information is requested label Jun 27, 2022
@zhaoyvli
Copy link
Author

zhaoyvli commented Jun 28, 2022

Hi@zhicwu, I'm using the latest JDBC bridge. Only execute a query that takes a long time to return results, Creating a new query will fail, and I need to wait for the task to finish before executing a new query.
My guess is that the task has not received any data from the database, causing the new task to wait.

@zhicwu
Copy link
Contributor

zhicwu commented Jun 28, 2022

There was an issue(#113) like this but it's been fixed. Just want to double confirm with you: 1) by latest version you meant 2.1.0, right? and 2) if connection pool is used, its maximum size is greater than one.

Will try if I can reproduce the issue on 2.1.0.

@zhaoyvli
Copy link
Author

zhaoyvli commented Jun 28, 2022

My JDBC bridge version is 2.1.0, clickhouse version is 21.12.3.32.

"maximumPoolSize": 5

@zhicwu
Copy link
Contributor

zhicwu commented Jun 28, 2022

Thanks for the extra information. What's your query looks like? How many jdbc table functions appear in your query? 3 connections will be used for one query like below:

select * from jdbc('ds1', 'query1')
union all
select * from jdbc('ds1', 'query2')
union all
select * from jdbc('ds1', 'query3')

@zhaoyvli
Copy link
Author

My query has only one JDBC table function.
select * from jdbc ('ds1', 'query1')

Tests have found that.

  1. First, a query is executed:
    select * from jdbc('ds1', 'select * from table_01')
    Second, execute the query:
    select * from jdbc('ds1', 'select * from table_01')
    If table_01 has 100 million entries, I can query two SQL processes in the Clickhouse and table_01 databases

  2. First, a query is executed:
    select cnt from jdbc('ds1', 'select count(1) as cnt from table_01')
    Second, execute the query:
    select * from jdbc('ds1', 'select * from table_01')
    If table_01 has 100 million entries. I can query two SQL processes in Clickhouse, but I can only query the first SQL process in table_01.

@zhicwu
Copy link
Contributor

zhicwu commented Jun 28, 2022

I was not able to reproduce the issue. What's the target database? It looks like the first query leads to a table lock?

@zhaoyvli
Copy link
Author

My target database is Oracle. But my first query using oracle, and my second query using sqlserver will have the same problem. So the reason is not because the table is locked.

@zhicwu
Copy link
Contributor

zhicwu commented Jun 28, 2022

I see what's going on there. By design, JDBC bridge will perform scheme inferring for each query. This is to help Clickhouse to understand the table schema. Usually it simply issue the original query with fetch size and max rows set to 1. This works well for the first case but not the second(not sure why count query is so slow in your environment). To avoid the extra query, you have to either use named query(query in json config file), or issue a query like select * from jdbc('ds1', 'a String, b Int32, ...', 'your query').

@zhaoyvli
Copy link
Author

zhaoyvli commented Jun 28, 2022

Thank you for your advice. When I execute query select cnt from jdbc('ds1','a String, b Int32', 'select count(1) as cnt from table_01'), creating a new query is successful.

My count query was so slow. A new query should not be blocked because of this.

@lichenglin
Copy link

It seems that the schema query is a serial method?

when there is a long query such as select * from jdbc('source','select count(1) from huge_table')

all the query is blocked.

but when running select * from jdbc('source','c Int64','select count(1) as c from huge_table')

It work well .

@lichenglin
Copy link

I think this is maybe the root cause.

In NamedDataSource

I found this
columnsCache.get(query, k -> { return inferTypes(schema, query, this.loadSavedQueryAsNeeded(k, params), params); })

When running a long sql. get may block the processs.

It's metioned in caffeine docs , see get javadoc

If the specified key is not already associated with a value, attempts to compute its value using the given mapping function and enters it into this cache unless null. The entire method invocation is performed atomically, so the function is applied at most once per key. **Some attempted update operations on this cache by other threads may be blocked while the computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this cache.**

@peder1001
Copy link

Any progress on making get stop blocking the process?

@lichenglin
Copy link

I modify JdbcBridgeVerticle
from

router.post("/columns_info").produces(RESPONSE_CONTENT_TYPE).handler(queryTimeoutHandler) .blockingHandler(this::handleColumnsInfo,SERIAL_MODE);

to

router.post("/columns_info").produces(RESPONSE_CONTENT_TYPE).handler(queryTimeoutHandler) .handler(this::handleColumnsInfo);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants