Hello Everyone,
I want to share our solution for Google Hash Code 2020. It is the code repo of team titanium-white for Google Hash Code 2020 Online Qualification Round. We have writen our code in Python.
Code can be reached from here
Code Basically, the problem class is the backbone of the system.
It handles the inputs and outputs with functions __init__ and dump. It read the input files and creates objects according to that. I think the one of the thing we did well is this __init__ function. It create all the objects like books, libraries even if book2score dictionary. It also handles the solving opeartion. It iterate trough the days and get max predicted score from every library available. When the libraries calculating the max pred scores, they get the current state as input. Therefore their predictionsa are more accurate. After the library is selected the state is updated such as day, already scanned books and state of the selected library. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 # main.py from library import Library, Book import numpy as np class Problem(object): def __init__(self, filename): print('file--', filename) f = open(filename) l = f.readline().split(' ') self.filename = filename self.nBooks, self.nLibs, self.nDays = int(l[0]), int(l[1]), int(l[2]) self.books = [ Book(i,int(score)) for i, score in enumerate(f.readline().split(' '))] sumb = 0 for b in self.books: sumb += b.score print( sumb/1000000 ) self.book2Score = {book.id: int(book.score) for book in self.books} self.libs = [] for libId in range(self.nLibs): l = [int(i) for i in f.readline().split(' ')] nBooks, nSign, nScan = l[0], l[1], l[2] books = [ Book(int(i), int(self.book2Score[int(i)])) for i in f.readline().split(' ') ] lib = Library(libId, nBooks, nSign, nScan, books) self.libs.append(lib) self.pri() def solve(self): t = 0 solution = [] readBookSet = set() while t < self.nDays: print('-----', t) scoreList = [] readBookList = [] for lib in self.libs: if lib.registered == False: score, curBookList = lib.predMaxScore(self.nDays - t, readBookSet) scoreList.append( score ) readBookList.append( curBookList ) else: scoreList.append(0) readBookList.append( [ ] ) if len(scoreList) == 0: break if max(scoreList) == 0: break libIndex = scoreList.index(max(scoreList)) self.libs[libIndex].registeredDay = t self.libs[libIndex].registered = True self.libs[libIndex].solBooks = readBookList[libIndex] readBookSet = readBookSet.union(readBookList[libIndex]) solution.append(self.libs[libIndex]) t += self.libs[libIndex].nSign print([(lib.id, lib.nSign, lib.registeredDay) for lib in solution]) return solution def dump(self, solution): f = open(self.filename.replace('data/', 'out/'), 'w+') f.write('{}\n'.format(len(solution))) for lib in solution: books = lib.solBooks f.write('{} {}\n'.format( lib.id, len(books) )) s = "" for book in books: s+= str(book) + ' ' s+='\n' f.write(s) f.close() def pri(self): print('------') print(self.nBooks, self.nLibs, self.nDays) #print(self.libs) print('------') if __name__ == "__main__": p = Problem('data/a.txt') solution = p.solve() p.dump(solution) The Second critical class is Library class. The vital function in our implementation is predMaxScore. We are doing calculating this with total book score / (sign day*sign day). Because, during our experiments, we saw that the sign days length is quite important especially for some datasets. Total book score is simply if in that day the library is chosen, how much of the books can be scanned until the end of total days. Of course these books are selecting according to their scores and they should be not scanned before.
...