Releases: StarRocks/starrocks

Release notes 2.2.0

25 May 13:13
671149f

New Features

  • [Preview] Resource groups are supported. By using resource groups to control CPU and memory usage, StarRocks can isolate resources and use them efficiently when different tenants run complex and simple queries in the same cluster.
  • [Preview] Java UDFs (user-defined functions) are supported. StarRocks supports writing UDFs in Java, extending StarRocks' functions.
  • [Preview] The primary key model supports partial updates when data is loaded using Stream Load, Broker Load, or Routine Load. In real-time data update scenarios such as updating orders and joining multiple streams, partial updates allow users to update only a few columns.
  • [Preview] JSON data types and JSON functions are supported.
  • External tables based on Apache Hudi are supported, which further improves the data lake analytics experience.
  • The following functions are supported:
    • ARRAY functions, including array_agg, array_sort, array_distinct, array_join, reverse, array_slice, array_concat, array_difference, array_overlap, and array_intersect
    • BITMAP functions, including bitmap_max and bitmap_min
    • Other functions, including retention and square
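The partial updates mentioned above for the primary key model can be pictured with a small in-memory model. This is only an illustrative Python sketch, not StarRocks code: the function `apply_partial_update`, the `orders` table, and its columns are hypothetical names, and the sketch assumes that columns absent from the incoming row keep their previous values.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_partial_update(table: Dict[int, Row], key: int, changes: Row) -> None:
    """Upsert by primary key, overwriting only the columns present in `changes`."""
    existing = table.get(key, {})
    # Columns missing from `changes` keep their previous values.
    table[key] = {**existing, **changes}


# Hypothetical table keyed by order_id.
orders: Dict[int, Row] = {
    1: {"status": "created", "amount": 30, "address": "A St"},
}

# Only the `status` column is supplied; `amount` and `address` are untouched.
apply_partial_update(orders, 1, {"status": "paid"})
```

In this model, a full-row load would have to resend `amount` and `address` as well; the partial update sends only the changed column.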

Improvements

  • The CBO's parser and analyzer are restructured, the code structure is optimized, and syntax such as INSERT with CTE is supported. As a result, the performance of complex queries, such as those that reuse common table expressions (CTEs), is improved.
  • The query performance of Apache Hive external tables on object storage (AWS S3, Alibaba Cloud OSS, Tencent COS) is optimized. After optimization, the performance of object storage-based queries is comparable to that of HDFS-based queries. Also, late materialization of ORC files is supported, improving query performance on small files.
  • When external tables are used to query Apache Hive, StarRocks supports automatic and incremental updating of cached metastore data by consuming Hive Metastore events, such as data changes and partition changes. StarRocks also supports querying DECIMAL and ARRAY data in Apache Hive.
  • The performance of the UNION ALL operator is optimized, improving performance by 2 to 25 times.
  • The pipeline engine, which can adaptively adjust query parallelism, is released, and its profile is optimized. The pipeline engine can improve performance for small queries in high-concurrency scenarios.
  • StarRocks supports the loading of CSV files with multi-character row delimiters.
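To illustrate what a multi-character row delimiter means for CSV data, here is a minimal Python sketch. It is not how StarRocks parses files: the function `split_rows`, the delimiter `|#|`, and the sample data are assumptions chosen for illustration, and quoting or escaping is deliberately ignored.

```python
from typing import List


def split_rows(data: str, row_delimiter: str, column_separator: str = ",") -> List[List[str]]:
    """Split raw text into rows and columns using a possibly multi-character row delimiter."""
    rows = []
    for line in data.split(row_delimiter):
        if line:  # skip the empty piece left after a trailing delimiter
            rows.append(line.split(column_separator))
    return rows


# Rows are terminated by the three-character sequence "|#|" instead of "\n".
raw = "1,alice|#|2,bob|#|3,carol|#|"
parsed = split_rows(raw, row_delimiter="|#|")
```

A multi-character delimiter is useful when a single character such as a newline can also appear inside field values.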

Bug Fixes

The following bugs are fixed:

  • Deadlocks occur when data is loaded and changes are committed into tables based on the Primary Key model. #4998
  • Some FE (including BDBJE) stability issues. #4428, #4666, #2
  • The return value overflows when the SUM function is used to calculate a large amount of data. #3944
  • The return values of ROUND and TRUNCATE functions have precision issues. #4256
  • Some bugs detected by SQLancer. See the SQLancer-related issues.

Others

  • The Flink connector flink-connector-starrocks supports Flink 1.14.

Release notes 2.0.5

14 May 03:56

Release date: May 13, 2022
Upgrade recommendation: Some critical bugs related to the correctness of stored data or data queries have been fixed in this version. We recommend that you upgrade your StarRocks cluster promptly.

Bug Fixes

The following bugs are fixed:

  • [Critical Bug] Data may be lost as a result of BE failures. This bug is fixed by introducing a mechanism that is used to publish a specific version to multiple BEs at a time. #3140
  • [Critical Bug] If tablets are migrated in specific data ingestion phases, data continues to be written to the original disk on which the tablets are stored. As a result, data is lost, and queries cannot be run properly. #5160
  • [Critical Bug] When you run queries after you perform multiple DELETE operations, you may obtain incorrect query results if optimization on low-cardinality columns is performed for the queries. #5712
  • [Critical Bug] If a query contains a JOIN clause that is used to combine a column with DOUBLE values and a column with VARCHAR values, the query result may be incorrect. #5809
  • In certain circumstances, when you load data into your StarRocks cluster, some replicas of specific versions are marked as valid by the FEs before taking effect. At this time, if you query data of the specific versions, StarRocks cannot find the data and reports errors. #5153
  • If a parameter in the SPLIT function is set to NULL, the BEs of your StarRocks cluster may stop running. #4092
  • After your cluster is upgraded from Apache Doris 0.13 to StarRocks 1.19.x and keeps running for a period of time, a further upgrade to StarRocks 2.0.1 may fail. #5309

Thanks to:

@ABingHuang, @Astralidea, @HangyuanLiu, @Pslydhh, @Seaven, @Youngwb, @adzfolc, @decster, @gengjun-git, @kangkaisen, @mergify, @miomiocat, @mofeiatwork, @rickif, @satanson, @sevev, @stdpain

Release notes 2.1.6

11 May 14:13
d44c230

Release date: May 10, 2022

Bug Fixes

The following bugs are fixed:

  • When you run queries after you perform multiple DELETE operations, you may obtain incorrect query results if optimization on low-cardinality columns is performed for the queries. #5712
  • If tablets are migrated in specific data ingestion phases, data continues to be written to the original disk on which the tablets are stored. As a result, data is lost, and queries cannot be run properly. #5160
  • If you convert values between the DECIMAL and STRING data types, the return values may have unexpected precision. #5608
  • If you multiply a DECIMAL value by a BIGINT value, an arithmetic overflow may occur. A few adjustments and optimizations are made to fix this bug. #4211

Thanks to:

@ABingHuang, @Astralidea, @HangyuanLiu, @Seaven, @ZiheLiu, @caneGuy, @gengjun-git, @mergify, @satanson, @sevev, @silverbullet233, @stdpain

Release notes 2.1.5

27 Apr 12:10
2a5c43f

Release date: April 27, 2022

Bug Fixes

The following bugs are fixed:

  • The calculation result is not correct when decimal multiplication overflows. After the bug is fixed, NULL is returned when decimal multiplication overflows.
  • When statistics deviate considerably from the actual data, the priority of Colocate Join can be lower than that of Broadcast Join. As a result, the query planner may not choose Colocate Join even when it is the more appropriate join strategy. #4817
  • Queries fail when more than four tables are joined, because the plan for complex expressions is wrong.
  • BEs may stop working under Shuffle Join when the shuffle column is a low-cardinality column. #4890
  • BEs may stop working when the SPLIT function uses a NULL parameter. #4092
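The decimal multiplication fix in the list above (NULL is returned on overflow instead of a wrong result) can be sketched in Python. This is an illustrative model only: the function `checked_multiply` is hypothetical, `None` stands in for SQL NULL, and the 38-digit limit is an assumption about the maximum decimal precision rather than something stated in these notes.

```python
from decimal import Decimal, localcontext
from typing import Optional

MAX_PRECISION = 38  # assumed maximum number of significant digits


def checked_multiply(a: Decimal, b: Decimal) -> Optional[Decimal]:
    """Return a * b, or None (standing in for SQL NULL) if the result needs too many digits."""
    with localcontext() as ctx:
        ctx.prec = 100  # wide enough that the products in this example are exact
        product = a * b
    if len(product.as_tuple().digits) > MAX_PRECISION:
        return None
    return product


small = checked_multiply(Decimal("12.5"), Decimal("4"))
overflow = checked_multiply(Decimal("9" * 30), Decimal("9" * 30))
```

Here `small` is an ordinary product, while `overflow` is `None` because the exact result has far more than 38 digits. Returning NULL makes the overflow visible to the caller instead of silently producing a wrong value.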

Thanks to:

@ABingHuang, @Astralidea, @HangyuanLiu, @Linkerist, @Seaven, @Youngwb, @adzfolc, @chaoyli, @decster, @gengjun-git, @kangkaisen, @liuyehcf, @meegoo, @mergify, @miomiocat, @mofeiatwork, @rickif, @satanson, @sevev, @stdpain, @trueeyu, @wyb

Release notes 2.0.4

18 Apr 03:07
ca947b1

Release date: April 18, 2022

Bug Fixes

The following bugs are fixed:

  • After columns are deleted, new partitions are added, and tablets are cloned, the unique IDs of columns in old and new tablets may differ. Because the system uses a shared tablet schema, this may cause BEs to stop working. #4514
  • When data is loaded into a StarRocks external table, the FE stops working if the configured FE of the target StarRocks cluster is not the Leader. #4573
  • Query results may be incorrect when a schema change and materialized view creation are performed on a Duplicate Key table at the same time. #4839
  • Data may be lost as a result of BE failures. This bug is fixed by using batch publish version. #3140

Release notes 2.1.4

12 Apr 09:56
d965a4f

Release date: April 8, 2022

New Feature

  • The UUID_NUMERIC function is supported, which returns a LARGEINT value. UUID_NUMERIC can be nearly two orders of magnitude faster than the UUID function.
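As a rough illustration of the difference between a textual UUID and a numeric one, the Python sketch below shows the same 128 bits in both forms. It does not reproduce StarRocks' implementation or its performance. The reinterpretation as a signed 128-bit value is an assumption about how such a number would fit into a signed LARGEINT.

```python
import uuid

u = uuid.uuid4()

as_text = str(u)   # 36-character string form (hex digits grouped 8-4-4-4-12)
as_number = u.int  # the same 128 bits as a Python integer

# Reinterpret the unsigned 128-bit value as a signed 128-bit integer
# (assumption: this mirrors how the value would fit in a signed LARGEINT).
if as_number >= 1 << 127:
    as_signed = as_number - (1 << 128)
else:
    as_signed = as_number
```

The numeric form carries the same information as the string form without any formatting into hexadecimal text.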

Bug Fixes

The following bugs are fixed:

  • After columns are deleted, new partitions are added, and tablets are cloned, the unique IDs of columns in old and new tablets may differ. Because the system uses a shared tablet schema, this may cause BEs to stop working. #4514
  • When data is loaded into a StarRocks external table, the FE stops working if the configured FE of the target StarRocks cluster is not the Leader. #4573
  • The CAST function returns different results in StarRocks 1.19 and 2.1. #4701
  • Query results may be incorrect when a schema change and materialized view creation are performed on a Duplicate Key table at the same time. #4839

Release notes 2.1.3

12 Apr 09:54
0881cb2

Release date: March 19, 2022

Bug Fixes

The following bugs are fixed:

  • Data may be lost as a result of BE failures. This bug is fixed by using batch publish version. #3140
  • Some queries may cause memory limit exceeded errors due to inappropriate execution plans.
  • The checksum between replicas may be inconsistent in different compaction processes. #3438
  • Queries may fail in some situations when JSON reorder projection is not processed correctly. #4056

Release notes 2.0.3

14 Mar 08:26

Release date: March 14, 2022

Bug Fixes

The following bugs are fixed:

  • Queries fail when BE nodes are unresponsive (in suspended animation).
  • Queries fail when there is no appropriate execution plan for single-tablet table joins. #3854
  • A deadlock problem may occur when an FE node collects information to build a global dictionary for low-cardinality optimization. #3839

Release notes 2.1.2

14 Mar 08:27

Release date: March 14, 2022

Bug Fixes

The following bugs are fixed:

  • In a rolling upgrade from version 1.19 to 2.1, BE nodes stop working because of unmatched chunk sizes between the two versions. #3834
  • Loading tasks may fail while StarRocks is upgrading from version 2.0 to 2.1. #3828
  • Queries fail when there is no appropriate execution plan for single-tablet table joins. #3854
  • A deadlock problem may occur when an FE node collects information to build a global dictionary for low-cardinality optimization. #3839
  • Queries fail when BE nodes are unresponsive (in suspended animation) due to a deadlock.
  • BI tools cannot connect to StarRocks when the SHOW VARIABLES command fails. #3708

Release notes 2.1.0

28 Feb 04:45
1864de0

New Features

  • [Preview] StarRocks now supports Iceberg external tables.
  • [Preview] The pipeline engine is now available. It is a new execution engine designed for multicore scheduling. The query parallelism can be adaptively adjusted without the need to set the parallel_fragment_exec_instance_num parameter. This also improves performance in high concurrency scenarios.
  • CTAS (CREATE TABLE AS SELECT) is supported, making ETL and table creation easier.
  • SQL fingerprints are supported. A SQL fingerprint is generated in audit.log for each statement, which helps locate slow queries.
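The general idea behind a SQL fingerprint can be sketched in a few lines of Python. The release notes do not describe the algorithm StarRocks uses, so everything below is an assumption for illustration: the function `sql_fingerprint`, the normalization rules (lowercasing, replacing literals with `?`, collapsing whitespace), and the choice of SHA-256.

```python
import hashlib
import re


def sql_fingerprint(sql: str) -> str:
    """Hash a normalized form of the statement so that literal values do not matter."""
    normalized = sql.strip().lower()
    normalized = re.sub(r"'(?:[^']|'')*'", "?", normalized)     # string literals
    normalized = re.sub(r"\b\d+(?:\.\d+)?\b", "?", normalized)  # numeric literals
    normalized = re.sub(r"\s+", " ", normalized)                # collapse whitespace
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = sql_fingerprint("SELECT * FROM orders WHERE id = 42 AND city = 'Paris'")
b = sql_fingerprint("select *  from orders where id = 7 and city = 'Rome'")
c = sql_fingerprint("SELECT * FROM users WHERE id = 42")
```

Statements `a` and `b` differ only in literals and formatting, so they share a fingerprint, while `c` has a different shape and gets a different one. Grouping log entries by such a value makes it easier to spot a slow query pattern regardless of its parameters.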

Improvements

  • Compaction is optimized. A flat table can contain up to 10,000 columns.
  • The performance of first-time scan and page cache is optimized. Random I/O is reduced to improve first-time scan performance. The improvement is more noticeable if first-time scan occurs on SATA disks. StarRocks' page cache can store original data, which eliminates the need for bitshuffle encoding and unnecessary decoding. This improves the cache hit rate and query efficiency.
  • Schema change is supported in the primary key model. You can add, delete, and modify bitmap indexes by using ALTER TABLE.
  • [Preview] The size of a string can be up to 1 MB.
  • JSON load performance is optimized. You can load more than 100 MB of JSON data in a single file.
  • Bitmap index performance is optimized.
  • The performance of StarRocks Hive external tables is optimized. Data in the CSV format can be read.
  • DEFAULT CURRENT_TIMESTAMP is supported in the CREATE TABLE statement. #1193
  • StarRocks supports the loading of CSV files with multiple delimiters.

Bug Fixes

The following bugs are fixed:

  • Auto __op mapping does not take effect if jsonpaths is specified in the command used for loading JSON data. #3405
  • BE nodes fail because the source data changes during data loading using Broker Load. #3481
  • Some SQL statements report errors after materialized views are created. #2975
  • The routine load does not work due to quoted jsonpaths. #2488
  • Query concurrency decreases sharply when the number of columns to query exceeds 200.

Behavior Changes

  • The API for disabling a Colocation Group is changed from DELETE /api/colocate/group_stable to POST /api/colocate/group_unstable.
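For client code that calls this API, the change amounts to a different HTTP method and path. The Python sketch below only builds the new request and does not send it. The host name, the port 8030, and the query parameter names `db_id` and `group_id` are assumptions for illustration, and authentication is omitted.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_mark_unstable_request(fe_host: str, http_port: int, db_id: int, group_id: int) -> Request:
    """Build (but do not send) the POST request for the new endpoint."""
    query = urlencode({"db_id": db_id, "group_id": group_id})  # assumed parameter names
    url = f"http://{fe_host}:{http_port}/api/colocate/group_unstable?{query}"
    return Request(url, method="POST")


# Hypothetical FE address and IDs.
req = build_mark_unstable_request("fe.example.com", 8030, 10005, 10008)
```

Scripts that previously issued DELETE against /api/colocate/group_stable would need both the method and the path updated.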

Others

  • flink-connector-starrocks is now available for Flink to read StarRocks data in batches. This improves data read efficiency compared to the JDBC connector.