Pagination return incorrect when using eq.Like with AllowFiltering() #297

Open
trunglh88 opened this issue Dec 24, 2024 · 3 comments

trunglh88 commented Dec 24, 2024

Hi team,

I have a table like this:

create table sample_table
(
    bucket         text,
    user_id        uuid,
    username       text,    
    keywords       text,    
    primary key ((bucket, user_id, username))
)
    with caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
     and compaction = {'class': 'SizeTieredCompactionStrategy'}
     and compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
     and speculative_retry = '99.0PERCENTILE';

When I run this SELECT command using cqlsh:

SELECT *
FROM keyspace.sample_table
WHERE user_id = <user_id_here>
  AND bucket = 'bucket_name' AND keywords LIKE '%cat%' ALLOW FILTERING;

the result is 21 rows.

If I call the API without the keyword query parameter, the result is correct (10 rows per page). But if I call it with keyword set to cat, only 6 rows are returned per page, even though I set PageSize to 10.

Here is my code:

func GetTracking(c *gin.Context) {

	var pageState []byte
	//var result []models.PetSimulator

	keyword := c.Request.URL.Query().Get("keyword")
	searchPattern := "%%%s%%"

	// get pagination params from query param
	pageSize, pageState := utils.GetPaginationParam(c)

	// get session from Redis and Database
	redis := database.GetSession(c)
	session := database.GetConnectionSession()

	t := table.New(metadata.PetSimulatorMaterializedView)

	bindMap := qb.M{"user_id": redis.Id, "bucket": database.Bucket}
	builder := qb.Select(t.Name()).
		Where(qb.Eq("user_id")).
		Where(qb.Eq("bucket"))

	if keyword != "" {
		searchPattern = fmt.Sprintf(searchPattern, keyword)
		bindMap["keywords"] = searchPattern

		builder.
			Where(qb.Like("keywords")).
			AllowFiltering()
	}

	stmt, cols := builder.ToCql()

	iter := session.Query(stmt, cols).
		BindMap(bindMap).
		PageSize(pageSize).
		PageState(pageState).
		Iter()

	defer iter.Close()

	//Set next page state.
	pageState = iter.PageState()

	//err := iter.Select(&result)
	//if err != nil {
	//	return
	//}

	var result []*models.PetSimulator

	scanner := iter.Scanner()
	for scanner.Next() {
		objModel := &models.PetSimulator{}

		if err := scanner.Scan(
			&objModel.Bucket,
			&objModel.UserId,
			&objModel.Username,			
			&objModel.Keywords,			
		); err != nil {
			panic(err)
		}

		result = append(result, objModel)
	}

	// parse next page token string
	nextPageToken := utils.GetNextPageToken(iter)

	//rows := result
	//if len(result) == 0 {
	//	rows = []models.PetSimulator{}
	//}

	response := models.Response{
		Count:     len(result),
		NextToken: nextPageToken,
		Rows:      result,
	}

	c.JSON(http.StatusOK, response)
}

I don't know why, when the query is built with

builder.
	Where(qb.Like("keywords")).
	AllowFiltering()

the response always contains fewer than 10 rows.

[cqlsh 6.0.23.dev9+gb09bc79 | Scylla 6.2.1-0.20241106.a3a0ffbcd015 | CQL spec 3.3.1 | Native protocol v4]

@dkropachev
Collaborator

dkropachev commented Dec 24, 2024

  1. I am surprised to see this working without ALLOW FILTERING (when keyword == ""). The server should require ALLOW FILTERING in that case, unless the table definition you provided is not the one you are actually running against.
  2. Scylla does not guarantee a specific number of rows per page when a query executes with ALLOW FILTERING. The reason is that the request is sent to every node and each node does its own paging, so it can happen that a particular node only knows about 6 matching records, and the coordinator then returns those 6 records to you. This is known and expected behavior.

Unrelated to the question: I see that you have username in your primary key together with user_id.
It does not make sense to have both of them there.
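
For readers hitting the same behavior, here is a minimal sketch of one way a client could still hand out full pages despite the short pages described in point 2. It is not part of gocqlx or the driver: `fetchFunc`, `collectPage` and `fakeFetch` are hypothetical names, and `fetchFunc` stands in for one paged query execution (in the handler above, that would be the `PageSize(...)`/`PageState(...)` query followed by the scan loop). The sketch assumes the fetch never returns more rows than the requested limit.

```go
package main

import "fmt"

// fetchFunc is a hypothetical stand-in for one paged query execution.
// It returns the rows of a single page plus the state for the next page
// (empty when the result set is exhausted). It must honor limit.
type fetchFunc func(state []byte, limit int) (rows []string, next []byte, err error)

// collectPage keeps requesting pages until want rows are gathered or the
// result set is exhausted. Each request asks only for the rows still
// missing, so the returned state never skips rows that were not delivered.
func collectPage(fetch fetchFunc, state []byte, want int) ([]string, []byte, error) {
	rows := make([]string, 0, want)
	for len(rows) < want {
		page, next, err := fetch(state, want-len(rows))
		if err != nil {
			return nil, nil, err
		}
		rows = append(rows, page...) // a page may be short or even empty
		state = next
		if len(state) == 0 {
			break // no more pages
		}
	}
	return rows, state, nil
}

// fakeFetch simulates a server that returns short pages: at most 3 rows
// per call, whatever limit was requested. The state is a one-byte offset.
func fakeFetch(data []string) fetchFunc {
	return func(state []byte, limit int) ([]string, []byte, error) {
		off := 0
		if len(state) > 0 {
			off = int(state[0])
		}
		n := limit
		if n > 3 {
			n = 3
		}
		end := off + n
		if end >= len(data) {
			return data[off:], nil, nil
		}
		return data[off:end], []byte{byte(end)}, nil
	}
}

func main() {
	data := make([]string, 21)
	for i := range data {
		data[i] = fmt.Sprintf("row-%02d", i)
	}
	fetch := fakeFetch(data)

	var state []byte
	for {
		rows, next, err := collectPage(fetch, state, 10)
		if err != nil {
			panic(err)
		}
		fmt.Println(len(rows)) // prints 10, then 10, then 1
		if len(next) == 0 {
			break
		}
		state = next
	}
}
```

The trade-off is extra round trips per API call: a very selective filter may need many server pages to fill one client page, so a cap on the number of iterations may be worth adding.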

@trunglh88
Author

@dkropachev thanks for explaining.

I am learning Scylla in order to migrate to it from MariaDB.

If I change my table's primary key to ((bucket, user_id), username), then all records with the same bucket and user_id (the same partition key) will be on the same node, right?

@dkropachev
Collaborator

> @dkropachev thanks for explaining.
>
> I am learning Scylla in order to migrate to it from MariaDB.

Good for you.

> If I change my table's primary key to ((bucket, user_id), username), then all records with the same bucket and user_id (the same partition key) will be on the same node, right?

Correct, all records with the same bucket and user_id will end up on the same node.

I recommend going through the Scylla University course on data modeling.
