HIVE-27328: Acid dirCache is not invalidated in TezAMs while dropping table#6309
HIVE-27328: Acid dirCache is not invalidated in TezAMs while dropping table#6309abstractdog merged 2 commits intoapache:masterfrom
Conversation
c371ea4 to
9cc8925
Compare
|
oh, this causes a lot of qtest noise because of the create_time value added to TableDesc properties, need to think about this |
9cc8925 to
38c5e14
Compare
UPDATE: solved from MapWork |
80bf84e to
4cc0c07
Compare
ayushtkn
left a comment
There was a problem hiding this comment.
Thanx @abstractdog for chasing this, dropped minor comments, rest looks good
| * @param tableCreateTime | ||
| * table creation time to store, represented as a string | ||
| */ | ||
| public static void setTableCreateTime(Configuration conf, Table table) { |
| public static void setTableCreateTime(Configuration conf, Table table) { | ||
| Objects.requireNonNull(table, "Cannot get table create time. Table object is expected to be non-null."); | ||
| String tableCreateTime = String.valueOf(table.getCreateTime()); | ||
| String fullTableName = String.format("%s.%s", table.getDbName(), table.getTableName()); |
There was a problem hiding this comment.
Can we use:
table.getFullyQualifiedName();
or
TableName.getDbTable(table.getDbName(), table.getTableName());
| Objects.requireNonNull(table, "Cannot get table create time. Table object is expected to be non-null."); | ||
| String tableCreateTime = String.valueOf(table.getCreateTime()); | ||
| String fullTableName = String.format("%s.%s", table.getDbName(), table.getTableName()); | ||
| conf.set(String.format("%s.%s", fullTableName, CREATE_TIME), tableCreateTime); |
There was a problem hiding this comment.
We could do setInt, rather than converting it into String and then setting
conf.setInt(String.format("%s.%s", fullTableName, CREATE_TIME), table.getCreateTime());
| public static int getTableCreateTime(Configuration conf, String tableName) { | ||
| String createTime = conf.get(String.format("%s.%s", tableName, CREATE_TIME)); | ||
| return createTime == null ? 0 : Integer.parseInt(createTime); |
|
|
||
| // Check whether the table was re-created after being stored in the cache. | ||
| // The value null check avoids a noisy log message during the initial lookup, when no cache entry exists. | ||
| if (value != null && tableCreateTimeInCache < tableCreateTime) { |
There was a problem hiding this comment.
The value is in int. Doubting about the Interger Overflow case, Should we just do
tableCreateTimeInCache != tableCreateTime
| mapWork.configureJobConf(jobConf); | ||
|
|
||
| // Then the table's create time should be present in the JobConf | ||
| String fullTableName = dbName + "." + tableName; |
There was a problem hiding this comment.
String fullTableName = TableName.getDbTable(dbName, tableName);
|
|
||
| @Override | ||
| public void configureJobConf(JobConf job) { | ||
| Table table = getConf().getTableMetadata(); |
There was a problem hiding this comment.
can getConf() be null?, maybe we can do, if there is such a chance
public void configureJobConf(JobConf job) {
if (getConf() != null && getConf().getTableMetadata() != null) {
Utilities.setTableCreateTime(job, getConf().getTableMetadata());
}
4cc0c07 to
ea68d9d
Compare
|
thanks @ayushtkn for your comments, all made sense, fixed them! |
|



What changes were proposed in this pull request?
See jira for initial analysis.
The patch introduces a table createTime check while using the acid dir cache. This create_time property is propagated to the TezAM in the vertex-level JobConf, making it able to invalidate stale entries that belong to the previous instance of the same table (before DROP).
An RPC call-based solution wouldn't work because HS2 has no such interface to the AMs, and introducing such functionality to the TezClient/DagClient would be an epic hack, so it's not an option.
Why are the changes needed?
Because stale cache can cause problems, that are hard to investigate and that weren't taken care of by HIVE-26060.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manually tested with minihs2.
looking for log entry in the AM log: